Overview

Dataset statistics

Number of variables: 38
Number of observations: 78096
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 22.6 MiB
Average record size in memory: 304.0 B
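
The two memory figures agree up to rounding. A minimal sketch of the arithmetic, assuming binary units (1 MiB = 2**20 bytes) and using only numbers reported in this section:

```python
# Figures copied from the dataset statistics in this report.
n_observations = 78096
avg_record_bytes = 304.0  # "Average record size in memory"

# Total size implied by the per-record average, in MiB (assumes 1 MiB = 2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20
print(f"implied total size: {total_mib:.1f} MiB")
```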

Variable types

Numeric: 11
Categorical: 27

Warnings

X3 has a high cardinality: 77404 distinct values High cardinality
Y3 has a high cardinality: 77405 distinct values High cardinality
Z3 has a high cardinality: 77405 distinct values High cardinality
X4 has a high cardinality: 74974 distinct values High cardinality
Y4 has a high cardinality: 74974 distinct values High cardinality
Z4 has a high cardinality: 74975 distinct values High cardinality
X5 has a high cardinality: 65074 distinct values High cardinality
Y5 has a high cardinality: 65072 distinct values High cardinality
Z5 has a high cardinality: 65074 distinct values High cardinality
X6 has a high cardinality: 52247 distinct values High cardinality
Y6 has a high cardinality: 52246 distinct values High cardinality
Z6 has a high cardinality: 52247 distinct values High cardinality
X7 has a high cardinality: 38943 distinct values High cardinality
Y7 has a high cardinality: 38944 distinct values High cardinality
Z7 has a high cardinality: 38944 distinct values High cardinality
X8 has a high cardinality: 30563 distinct values High cardinality
Y8 has a high cardinality: 30563 distinct values High cardinality
Z8 has a high cardinality: 30563 distinct values High cardinality
X9 has a high cardinality: 23969 distinct values High cardinality
Y9 has a high cardinality: 23969 distinct values High cardinality
Z9 has a high cardinality: 23969 distinct values High cardinality
X10 has a high cardinality: 14754 distinct values High cardinality
Y10 has a high cardinality: 14754 distinct values High cardinality
Z10 has a high cardinality: 14754 distinct values High cardinality
Y0 is highly correlated with Z0 High correlation
Z0 is highly correlated with Y0 High correlation
Y1 is highly correlated with Z1 High correlation
Z1 is highly correlated with Y1 High correlation
Y2 is highly correlated with Z2 High correlation
Z2 is highly correlated with Y2 High correlation
Y1 is highly correlated with Y0 and 3 other fields High correlation
Y0 is highly correlated with Y1 and 3 other fields High correlation
X1 is highly correlated with Y1 and 3 other fields High correlation
Y2 is highly correlated with Y1 and 3 other fields High correlation
Z0 is highly correlated with Y0 and 4 other fields High correlation
Z1 is highly correlated with Y1 and 1 other field High correlation
Z2 is highly correlated with Y2 and 1 other field High correlation
User is highly correlated with Z0 High correlation
X2 is highly correlated with Y2 and 3 other fields High correlation
Class is highly correlated with X11 and 2 other fields High correlation
X11 is highly correlated with X1 and 4 other fields High correlation
Y11 is highly correlated with X1 and 4 other fields High correlation
Z11 is highly correlated with X1 and 4 other fields High correlation
X0 is highly correlated with Y0 and 1 other field High correlation
X11 is highly correlated with Y11 and 1 other field High correlation
Y11 is highly correlated with X11 and 1 other field High correlation
Z11 is highly correlated with X11 and 1 other field High correlation
User has 9049 (11.6%) zeros Zeros

Reproduction

Analysis started: 2021-09-24 09:30:00.505929
Analysis finished: 2021-09-24 09:30:52.057376
Duration: 51.55 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
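
The reported duration follows from the two timestamps. A small sketch using only the standard library and the values shown in this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-09-24 09:30:00.505929")
finished = datetime.fromisoformat("2021-09-24 09:30:52.057376")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```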

Variables

Class
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.983737964
Minimum: 0
Maximum: 5
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 610.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.421183462
Coefficient of variation (CV): 0.4763097429
Kurtosis: -1.299457641
Mean: 2.983737964
Median Absolute Deviation (MAD): 1
Skewness: 0.01431789567
Sum: 233018
Variance: 2.019762433
Monotonicity: Not monotonic
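
Because Class takes only six values, the descriptive statistics can be recomputed from the value counts listed in its frequency table. The sketch below does this in plain Python; the use of the sample variance (n - 1 denominator) is an assumption, chosen because it matches the reported figure.

```python
import math

# Value -> count, copied from the Class frequency table in this report.
class_counts = {0: 1, 1: 16265, 2: 14978, 3: 16344, 4: 14775, 5: 15733}

n = sum(class_counts.values())                         # number of observations
total = sum(v * c for v, c in class_counts.items())    # "Sum"
mean = total / n
# Sample variance with an n - 1 denominator (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in class_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                        # coefficient of variation

print(n, total, round(mean, 6), round(variance, 6), round(cv, 6))
```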
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
3      16344  20.9%
1      16265  20.8%
5      15733  20.1%
2      14978  19.2%
4      14775  18.9%
0      1      < 0.1%
Value  Count  Frequency (%)
0      1      < 0.1%
1      16265  20.8%
2      14978  19.2%
3      16344  20.9%
4      14775  18.9%
5      15733  20.1%
Value  Count  Frequency (%)
5      15733  20.1%
4      14775  18.9%
3      16344  20.9%
2      14978  19.2%
1      16265  20.8%
0      1      < 0.1%

User
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.959127228
Minimum: 0
Maximum: 14
Zeros: 9049
Zeros (%): 11.6%
Negative: 0
Negative (%): 0.0%
Memory size: 610.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
median: 9
Q3: 12
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.697810372
Coefficient of variation (CV): 0.5902418994
Kurtosis: -1.122911826
Mean: 7.959127228
Median Absolute Deviation (MAD): 4
Skewness: -0.4806113169
Sum: 621576
Variance: 22.06942229
Monotonicity: Increasing
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
10                9573   12.3%
0                 9049   11.6%
13                8739   11.2%
11                8061   10.3%
14                7495   9.6%
8                 6811   8.7%
5                 5105   6.5%
12                4865   6.2%
1                 4717   6.0%
2                 4513   5.8%
Other values (4)  9168   11.7%
Value  Count  Frequency (%)
0      9049   11.6%
1      4717   6.0%
2      4513   5.8%
4      379    0.5%
5      5105   6.5%
6      4377   5.6%
7      492    0.6%
8      6811   8.7%
9      3920   5.0%
10     9573   12.3%
Value  Count  Frequency (%)
14     7495   9.6%
13     8739   11.2%
12     4865   6.2%
11     8061   10.3%
10     9573   12.3%
9      3920   5.0%
8      6811   8.7%
7      492    0.6%
6      4377   5.6%
5      5105   6.5%
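
The quantile statistics for User can be checked against these counts. The sketch below assumes quantiles are taken with linear interpolation between neighbouring order statistics (the pandas default); `value_at` and `quantile` are illustrative helper names, not part of any library.

```python
# Value -> count for User, assembled from the frequency tables above
# (the value 3 does not occur in the data).
user_counts = {0: 9049, 1: 4717, 2: 4513, 4: 379, 5: 5105, 6: 4377, 7: 492,
               8: 6811, 9: 3920, 10: 9573, 11: 8061, 12: 4865, 13: 8739, 14: 7495}

def value_at(index, counts):
    """Return the value at a 0-based position in the sorted, expanded data."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if index < seen:
            return value
    raise IndexError(index)

def quantile(q, counts):
    """Quantile with linear interpolation between neighbouring order statistics."""
    n = sum(counts.values())
    pos = q * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    low_value = value_at(lo, counts)
    return low_value + frac * (value_at(hi, counts) - low_value)

q1, median, q3 = (quantile(q, user_counts) for q in (0.25, 0.5, 0.75))
print(q1, median, q3, q3 - q1)
```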

X0
Real number (ℝ)

HIGH CORRELATION

Distinct: 78087
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.34566383
Minimum: -108.5527385
Maximum: 190.0178353
Zeros: 1
Zeros (%): < 0.1%
Negative: 6643
Negative (%): 8.5%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -108.5527385
5-th percentile: -10.34601324
Q1: 29.29506186
median: 54.61996394
Q3: 72.48868553
95-th percentile: 103.5106456
Maximum: 190.0178353
Range: 298.5705737
Interquartile range (IQR): 43.19362368

Descriptive statistics

Standard deviation: 32.69617297
Coefficient of variation (CV): 0.6494337443
Kurtosis: -0.2126514752
Mean: 50.34566383
Median Absolute Deviation (MAD): 21.07649329
Skewness: -0.3080754761
Sum: 3931794.962
Variance: 1069.039727
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
69.91166731           3      < 0.1%
55.55539571           2      < 0.1%
82.24780793           2      < 0.1%
-19.86673437          2      < 0.1%
96.28710541           2      < 0.1%
-6.446211601          2      < 0.1%
62.45438164           2      < 0.1%
36.03481006           2      < 0.1%
73.66263595           1      < 0.1%
92.70254257           1      < 0.1%
Other values (78077)  78077  > 99.9%
Value                 Count  Frequency (%)
-108.5527385          1      < 0.1%
-98.4733081           1      < 0.1%
-96.56882929          1      < 0.1%
-91.84961792          1      < 0.1%
-83.1856058           1      < 0.1%
-81.95374929          1      < 0.1%
-80.92851172          1      < 0.1%
-80.91229226          1      < 0.1%
-80.58373877          1      < 0.1%
-80.31018215          1      < 0.1%
Value                 Count  Frequency (%)
190.0178353           1      < 0.1%
189.7780363           1      < 0.1%
163.3298404           1      < 0.1%
160.850926            1      < 0.1%
157.507165            1      < 0.1%
154.7923076           1      < 0.1%
154.536044            1      < 0.1%
151.7747427           1      < 0.1%
151.5860346           1      < 0.1%
151.5840203           1      < 0.1%

Y0
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78090
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.81205072
Minimum: -98.23375643
Maximum: 169.1754637
Zeros: 1
Zeros (%): < 0.1%
Negative: 1020
Negative (%): 1.3%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -98.23375643
5-th percentile: 11.62690992
Q1: 63.49443226
median: 86.52624635
Q3: 113.107355
95-th percentile: 147.3670991
Maximum: 169.1754637
Range: 267.4092201
Interquartile range (IQR): 49.61292275

Descriptive statistics

Standard deviation: 40.20436291
Coefficient of variation (CV): 0.4685165146
Kurtosis: -0.3833101698
Mean: 85.81205072
Median Absolute Deviation (MAD): 24.70686051
Skewness: -0.2791227537
Sum: 6701577.913
Variance: 1616.390797
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
93.92561581           3      < 0.1%
89.85748315           2      < 0.1%
88.67736952           2      < 0.1%
79.06747674           2      < 0.1%
86.01355398           2      < 0.1%
0                     1      < 0.1%
84.13022116           1      < 0.1%
136.060679            1      < 0.1%
31.93606613           1      < 0.1%
31.71166769           1      < 0.1%
Other values (78080)  78080  > 99.9%
Value                 Count  Frequency (%)
-98.23375643          1      < 0.1%
-98.02819915          1      < 0.1%
-97.87002687          1      < 0.1%
-97.81220475          1      < 0.1%
-97.75906967          1      < 0.1%
-97.7244942           1      < 0.1%
-72.40196719          1      < 0.1%
-70.25229006          1      < 0.1%
-67.15790442          1      < 0.1%
-63.98317186          1      < 0.1%
Value                 Count  Frequency (%)
169.1754637           1      < 0.1%
168.8866749           1      < 0.1%
168.7174579           1      < 0.1%
168.6860295           1      < 0.1%
168.6164898           1      < 0.1%
168.579351            1      < 0.1%
168.2169935           1      < 0.1%
167.9829018           1      < 0.1%
167.981846            1      < 0.1%
167.958743            1      < 0.1%

Z0
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78090
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -29.98471184
Minimum: -126.7708719
Maximum: 113.3451187
Zeros: 1
Zeros (%): < 0.1%
Negative: 59682
Negative (%): 76.4%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -126.7708719
5-th percentile: -85.62163672
Q1: -56.35643766
median: -30.86412481
Q3: -1.418802619
95-th percentile: 25.65224517
Maximum: 113.3451187
Range: 240.1159907
Interquartile range (IQR): 54.93763504

Descriptive statistics

Standard deviation: 34.36191841
Coefficient of variation (CV): -1.145981279
Kurtosis: -0.5844941882
Mean: -29.98471184
Median Absolute Deviation (MAD): 27.08476795
Skewness: 0.1156770151
Sum: -2341686.056
Variance: 1180.741437
Monotonicity: Not monotonic
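
The negative coefficient of variation is not an error: the figure is consistent with CV computed as standard deviation divided by the (signed) mean, so a negative mean yields a negative CV. A quick check with the reported values, under that assumption:

```python
std = 34.36191841    # standard deviation reported for Z0
mean = -29.98471184  # mean reported for Z0

# CV taken as std / mean; the sign follows the sign of the mean.
cv = std / mean
print(round(cv, 6))
```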
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
-7.769513846          3      < 0.1%
24.92839012           2      < 0.1%
-15.90809152          2      < 0.1%
-24.20028444          2      < 0.1%
11.61824523           2      < 0.1%
0                     1      < 0.1%
-40.0348243           1      < 0.1%
-14.70104254          1      < 0.1%
-52.04824079          1      < 0.1%
-51.9812435           1      < 0.1%
Other values (78080)  78080  > 99.9%
Value                 Count  Frequency (%)
-126.7708719          1      < 0.1%
-126.7087941          1      < 0.1%
-126.5377037          1      < 0.1%
-126.5084254          1      < 0.1%
-126.4721904          1      < 0.1%
-126.4369321          1      < 0.1%
-126.4053536          1      < 0.1%
-126.3687719          1      < 0.1%
-126.3018286          1      < 0.1%
-126.2383218          1      < 0.1%
Value                 Count  Frequency (%)
113.3451187           1      < 0.1%
113.1306602           1      < 0.1%
109.5275952           1      < 0.1%
109.3495524           1      < 0.1%
108.0231963           1      < 0.1%
106.1480191           1      < 0.1%
105.0763497           1      < 0.1%
103.661773            1      < 0.1%
103.2860476           1      < 0.1%
101.7919476           1      < 0.1%

X1
Real number (ℝ)

HIGH CORRELATION

Distinct: 78090
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.59520885
Minimum: -111.6852412
Maximum: 188.6919966
Zeros: 1
Zeros (%): < 0.1%
Negative: 6951
Negative (%): 8.9%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -111.6852412
5-th percentile: -10.28193604
Q1: 28.75513657
median: 54.21551426
Q3: 71.76203882
95-th percentile: 102.2078761
Maximum: 188.6919966
Range: 300.3772378
Interquartile range (IQR): 43.00690225

Descriptive statistics

Standard deviation: 32.47823836
Coefficient of variation (CV): 0.65486645
Kurtosis: -0.2366674035
Mean: 49.59520885
Median Absolute Deviation (MAD): 21.1077295
Skewness: -0.3077319869
Sum: 3873187.431
Variance: 1054.835967
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
15.68716353           2      < 0.1%
74.86044765           2      < 0.1%
60.04661322           2      < 0.1%
18.74957141           2      < 0.1%
53.31758657           2      < 0.1%
54.93325651           2      < 0.1%
48.69489665           1      < 0.1%
8.938447254           1      < 0.1%
92.88037216           1      < 0.1%
6.452719535           1      < 0.1%
Other values (78080)  78080  > 99.9%
Value                 Count  Frequency (%)
-111.6852412          1      < 0.1%
-106.0024038          1      < 0.1%
-99.86829565          1      < 0.1%
-99.5242038           1      < 0.1%
-99.450227            1      < 0.1%
-97.26455003          1      < 0.1%
-84.97786105          1      < 0.1%
-83.51186184          1      < 0.1%
-82.21403378          1      < 0.1%
-82.17312218          1      < 0.1%
Value                 Count  Frequency (%)
188.6919966           1      < 0.1%
165.9754158           1      < 0.1%
158.7834535           1      < 0.1%
154.3222916           1      < 0.1%
153.714013            1      < 0.1%
151.6787026           1      < 0.1%
151.5492059           1      < 0.1%
151.5441642           1      < 0.1%
151.2714132           1      < 0.1%
150.6442613           1      < 0.1%

Y1
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78093
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.1926472
Minimum: -96.14258939
Maximum: 170.2093504
Zeros: 1
Zeros (%): < 0.1%
Negative: 1443
Negative (%): 1.8%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -96.14258939
5-th percentile: 9.866469746
Q1: 64.15452851
median: 87.54275093
Q3: 116.2193985
95-th percentile: 147.0656301
Maximum: 170.2093504
Range: 266.3519398
Interquartile range (IQR): 52.06486995

Descriptive statistics

Standard deviation: 40.45321396
Coefficient of variation (CV): 0.4693348595
Kurtosis: -0.3470154403
Mean: 86.1926472
Median Absolute Deviation (MAD): 24.72145868
Skewness: -0.3541298874
Sum: 6731300.975
Variance: 1636.46252
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
98.19975818           2      < 0.1%
79.02468375           2      < 0.1%
78.55340629           2      < 0.1%
96.36677996           1      < 0.1%
85.45429564           1      < 0.1%
32.40967689           1      < 0.1%
84.79706227           1      < 0.1%
84.58507599           1      < 0.1%
84.31184099           1      < 0.1%
31.53577997           1      < 0.1%
Other values (78083)  78083  > 99.9%
Value                 Count  Frequency (%)
-96.14258939          1      < 0.1%
-88.78772044          1      < 0.1%
-88.18694835          1      < 0.1%
-87.91227333          1      < 0.1%
-87.57557027          1      < 0.1%
-75.57093224          1      < 0.1%
-65.75845339          1      < 0.1%
-63.2090546           1      < 0.1%
-61.20305959          1      < 0.1%
-59.39250604          1      < 0.1%
Value                 Count  Frequency (%)
170.2093504           1      < 0.1%
169.4402427           1      < 0.1%
168.3379641           1      < 0.1%
168.1856905           1      < 0.1%
168.0778885           1      < 0.1%
167.9714323           1      < 0.1%
167.9393564           1      < 0.1%
167.8857049           1      < 0.1%
167.8513765           1      < 0.1%
167.5594466           1      < 0.1%

Z1
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78094
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -29.50920162
Minimum: -166.0068383
Maximum: 104.6978523
Zeros: 1
Zeros (%): < 0.1%
Negative: 58887
Negative (%): 75.4%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -166.0068383
5-th percentile: -85.08615466
Q1: -57.36010689
median: -30.18400462
Q3: -0.3666919561
95-th percentile: 26.53806233
Maximum: 104.6978523
Range: 270.7046906
Interquartile range (IQR): 56.99341493

Descriptive statistics

Standard deviation: 34.7643982
Coefficient of variation (CV): -1.178086709
Kurtosis: -0.7280836807
Mean: -29.50920162
Median Absolute Deviation (MAD): 28.39320583
Skewness: 0.1077609169
Sum: -2304550.609
Variance: 1208.563382
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
-48.85861068          2      < 0.1%
-0.335495579          2      < 0.1%
0                     1      < 0.1%
-37.67793307          1      < 0.1%
-51.48457427          1      < 0.1%
-40.0242939           1      < 0.1%
-40.70635624          1      < 0.1%
-41.18828502          1      < 0.1%
-51.7668601           1      < 0.1%
-12.71429114          1      < 0.1%
Other values (78084)  78084  > 99.9%
Value                 Count  Frequency (%)
-166.0068383          1      < 0.1%
-130.0628661          1      < 0.1%
-129.3248355          1      < 0.1%
-126.3277945          1      < 0.1%
-126.2932973          1      < 0.1%
-126.1800124          1      < 0.1%
-125.7558485          1      < 0.1%
-125.6046538          1      < 0.1%
-125.3591143          1      < 0.1%
-124.9961282          1      < 0.1%
Value                 Count  Frequency (%)
104.6978523           1      < 0.1%
104.6181148           1      < 0.1%
104.6027674           1      < 0.1%
104.5893651           1      < 0.1%
104.4921102           1      < 0.1%
104.4714439           1      < 0.1%
104.4614538           1      < 0.1%
104.3348182           1      < 0.1%
104.2664869           1      < 0.1%
103.2936709           1      < 0.1%

X2
Real number (ℝ)

HIGH CORRELATION

Distinct: 78086
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.61212116
Minimum: -106.8865243
Maximum: 188.7601677
Zeros: 1
Zeros (%): < 0.1%
Negative: 8092
Negative (%): 10.4%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -106.8865243
5-th percentile: -11.90228033
Q1: 25.17000649
median: 53.81458013
Q3: 71.56195062
95-th percentile: 103.3435483
Maximum: 188.7601677
Range: 295.646692
Interquartile range (IQR): 46.39194413

Descriptive statistics

Standard deviation: 33.60538973
Coefficient of variation (CV): 0.6912965105
Kurtosis: -0.3898883791
Mean: 48.61212116
Median Absolute Deviation (MAD): 22.17295182
Skewness: -0.2695575064
Sum: 3796412.214
Variance: 1129.322219
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
19.31549454           3      < 0.1%
19.17316787           3      < 0.1%
-13.89498933          2      < 0.1%
12.6797667            2      < 0.1%
-11.52717309          2      < 0.1%
106.775111            2      < 0.1%
85.82044004           2      < 0.1%
-14.14222011          2      < 0.1%
0                     1      < 0.1%
9.031852529           1      < 0.1%
Other values (78076)  78076  > 99.9%
Value                 Count  Frequency (%)
-106.8865243          1      < 0.1%
-103.9619188          1      < 0.1%
-101.5652831          1      < 0.1%
-99.9174529           1      < 0.1%
-98.65063177          1      < 0.1%
-97.28117088          1      < 0.1%
-82.89842863          1      < 0.1%
-82.58003607          1      < 0.1%
-82.45240013          1      < 0.1%
-82.41368215          1      < 0.1%
Value                 Count  Frequency (%)
188.7601677           1      < 0.1%
163.5473102           1      < 0.1%
153.7169359           1      < 0.1%
152.6475507           1      < 0.1%
151.9689665           1      < 0.1%
151.4153378           1      < 0.1%
151.4057531           1      < 0.1%
151.2221663           1      < 0.1%
149.7565958           1      < 0.1%
149.6254873           1      < 0.1%

Y2
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78089
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.77131466
Minimum: -100.7893122
Maximum: 168.1864665
Zeros: 1
Zeros (%): < 0.1%
Negative: 1819
Negative (%): 2.3%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -100.7893122
5-th percentile: 7.860157543
Q1: 58.0523851
median: 86.45832359
Q3: 106.6608272
95-th percentile: 145.66901
Maximum: 168.1864665
Range: 268.9757787
Interquartile range (IQR): 48.60844212

Descriptive statistics

Standard deviation: 41.02354311
Coefficient of variation (CV): 0.489708718
Kurtosis: -0.4325056951
Mean: 83.77131466
Median Absolute Deviation (MAD): 25.66366947
Skewness: -0.329806004
Sum: 6542204.589
Variance: 1682.931089
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
77.99360152           3      < 0.1%
111.9809364           2      < 0.1%
59.18616149           2      < 0.1%
42.3218611            2      < 0.1%
92.67107094           2      < 0.1%
129.2545414           2      < 0.1%
46.5885069            1      < 0.1%
136.013976            1      < 0.1%
96.19771816           1      < 0.1%
96.27886188           1      < 0.1%
Other values (78079)  78079  > 99.9%
Value                 Count  Frequency (%)
-100.7893122          1      < 0.1%
-99.48197478          1      < 0.1%
-96.17130455          1      < 0.1%
-89.97275409          1      < 0.1%
-89.45226952          1      < 0.1%
-69.6164025           1      < 0.1%
-64.8303424           1      < 0.1%
-63.46632365          1      < 0.1%
-60.07600056          1      < 0.1%
-60.0583665           1      < 0.1%
Value                 Count  Frequency (%)
168.1864665           1      < 0.1%
167.973416            1      < 0.1%
167.8819467           1      < 0.1%
167.8300152           1      < 0.1%
167.3745656           1      < 0.1%
166.8803771           1      < 0.1%
166.8741894           1      < 0.1%
166.7857247           1      < 0.1%
166.7144936           1      < 0.1%
166.6505586           1      < 0.1%

Z2
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 78089
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -30.56051494
Minimum: -129.5952959
Maximum: 104.590879
Zeros: 1
Zeros (%): < 0.1%
Negative: 59293
Negative (%): 75.9%
Memory size: 610.2 KiB

Quantile statistics

Minimum: -129.5952959
5-th percentile: -84.65637719
Q1: -58.65405886
median: -32.35241413
Q3: -0.9447856182
95-th percentile: 27.13669177
Maximum: 104.590879
Range: 234.1861749
Interquartile range (IQR): 57.70927324

Descriptive statistics

Standard deviation: 35.12032911
Coefficient of variation (CV): -1.149206065
Kurtosis: -0.7728259869
Mean: -30.56051494
Median Absolute Deviation (MAD): 28.11877784
Skewness: 0.181835072
Sum: -2386653.975
Variance: 1233.437517
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
-49.50336949          3      < 0.1%
-14.53792796          2      < 0.1%
-36.12722914          2      < 0.1%
-39.47665916          2      < 0.1%
2.687014041           2      < 0.1%
-22.18913265          2      < 0.1%
-49.83294881          1      < 0.1%
-12.28603875          1      < 0.1%
-11.74765976          1      < 0.1%
-12.5496263           1      < 0.1%
Other values (78079)  78079  > 99.9%
Value                 Count  Frequency (%)
-129.5952959          1      < 0.1%
-125.1138663          1      < 0.1%
-125.10212            1      < 0.1%
-124.5102929          1      < 0.1%
-124.0312741          1      < 0.1%
-123.4413316          1      < 0.1%
-123.0982583          1      < 0.1%
-121.1064208          1      < 0.1%
-120.4144845          1      < 0.1%
-119.903386           1      < 0.1%
Value                 Count  Frequency (%)
104.590879            1      < 0.1%
104.1931596           1      < 0.1%
103.2903814           1      < 0.1%
103.2525172           1      < 0.1%
103.0443914           1      < 0.1%
102.2112233           1      < 0.1%
102.1947588           1      < 0.1%
102.1530587           1      < 0.1%
101.7238818           1      < 0.1%
101.0098889           1      < 0.1%

X3
Categorical

HIGH CARDINALITY

Distinct: 77404
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 610.2 KiB

Value                 Count
?                     690
42.0229288803774      2
75.7791822770612      2
76.2877077845798      2
-3.38285395430093     1
Other values (77399)  77399

Length

Max length: 20
Median length: 16
Mean length: 15.87492317
Min length: 1

Characters and Unicode

Total characters: 1239768
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77400
Unique (%): 99.1%
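
"Distinct" and "Unique" measure different things here: the figures are consistent with Distinct counting every different value and Unique counting only values that occur exactly once (77400 unique values, plus "?" and three values seen twice, gives 77404 distinct). A toy illustration on a hypothetical miniature column:

```python
from collections import Counter

# Hypothetical miniature column: "?" repeated, one value seen twice, the rest once.
column = ["?", "?", "?", "42.02", "42.02", "75.78", "-3.38"]

counts = Counter(column)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
print(n_distinct, n_unique)
```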

Sample

1st row: 0
2nd row: 85.2322638852917
3rd row: 87.4508729469625
4th row: 86.8353875680762
5th row: 61.5961571288978

Common Values

Value                 Count  Frequency (%)
?                     690    0.9%
42.0229288803774      2      < 0.1%
75.7791822770612      2      < 0.1%
76.2877077845798      2      < 0.1%
-3.38285395430093     1      < 0.1%
70.5669563356232      1      < 0.1%
-3.18980061986317     1      < 0.1%
-3.40704223223579     1      < 0.1%
-3.56638708761331     1      < 0.1%
5.33651447565492      1      < 0.1%
Other values (77394)  77394  99.1%
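
The most common value, "?", is the likely reason this otherwise numeric column was typed as Categorical with high cardinality and no missing cells. The sketch below assumes "?" is a missing-value placeholder (the report does not say so) and shows one way to convert such cells to floats; `parse_cell` is a hypothetical helper. When loading with pandas, passing `na_values="?"` to `read_csv` would be a comparable option.

```python
import math

def parse_cell(text, missing_token="?"):
    """Convert a raw cell to float, mapping the assumed placeholder to NaN."""
    text = text.strip()
    return math.nan if text == missing_token else float(text)

# Sample rows of X3 from the report, plus the "?" placeholder.
raw = ["0", "85.2322638852917", "87.4508729469625", "?"]
parsed = [parse_cell(cell) for cell in raw]
n_missing = sum(math.isnan(x) for x in parsed)
print(parsed, n_missing)
```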

Length

Histogram of lengths of the category
ValueCountFrequency (%)
690
 
0.9%
42.02292888037742
 
< 0.1%
75.77918227706122
 
< 0.1%
76.28770778457982
 
< 0.1%
3.382853954300931
 
< 0.1%
70.56695633562321
 
< 0.1%
3.189800619863171
 
< 0.1%
3.407042232235791
 
< 0.1%
3.566387087613311
 
< 0.1%
5.336514475654921
 
< 0.1%
Other values (77394)77394
99.1%

Most occurring characters

Value             Count   Frequency (%)
1                 124241  10.0%
5                 121505  9.8%
6                 118544  9.6%
7                 115917  9.3%
3                 115873  9.3%
4                 115323  9.3%
2                 114480  9.2%
8                 113141  9.1%
9                 111378  9.0%
0                 102787  8.3%
Other values (3)  86579   7.0%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1153189  93.0%
Other Punctuation  78095    6.3%
Dash Punctuation   8484     0.7%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      124241  10.8%
5      121505  10.5%
6      118544  10.3%
7      115917  10.1%
3      115873  10.0%
4      115323  10.0%
2      114480  9.9%
8      113141  9.8%
9      111378  9.7%
0      102787  8.9%
Other Punctuation
Value  Count  Frequency (%)
.      77405  99.1%
?      690    0.9%
Dash Punctuation
Value  Count  Frequency (%)
-      8484   100.0%
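
The category names above correspond to Unicode general categories: Decimal Number is `Nd`, Other Punctuation is `Po` (both "." and "?"), and Dash Punctuation is `Pd` ("-"). A minimal sketch of such a tally using Python's standard `unicodedata` module on a few sample X3 values; how the report itself computes these counts is not shown in the source.

```python
import unicodedata
from collections import Counter

# Sample X3 values from the report, plus the "?" placeholder.
values = ["0", "85.2322638852917", "-3.38285395430093", "?"]

# Tally the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)
print(category_counts)
```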

Most occurring scripts

Value   Count    Frequency (%)
Common  1239768  100.0%

Most frequent character per script

Common
Value             Count   Frequency (%)
1                 124241  10.0%
5                 121505  9.8%
6                 118544  9.6%
7                 115917  9.3%
3                 115873  9.3%
4                 115323  9.3%
2                 114480  9.2%
8                 113141  9.1%
9                 111378  9.0%
0                 102787  8.3%
Other values (3)  86579   7.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1239768  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
1                 124241  10.0%
5                 121505  9.8%
6                 118544  9.6%
7                 115917  9.3%
3                 115873  9.3%
4                 115323  9.3%
2                 114480  9.2%
8                 113141  9.1%
9                 111378  9.0%
0                 102787  8.3%
Other values (3)  86579   7.0%

Y3
Categorical

HIGH CARDINALITY

Distinct: 77405
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 610.2 KiB

Value                 Count
?                     690
59.0862637259168      2
95.8894937439639      2
49.6314335247607      1
92.0647309640474      1
Other values (77400)  77400

Length

Max length: 21
Median length: 16
Mean length: 15.79299836
Min length: 1

Characters and Unicode

Total characters: 1233370
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77402
Unique (%): 99.1%

Sample

1st row: 0
2nd row: 67.7492195028673
3rd row: 68.4008083028339
4th row: 68.9079249764243
5th row: 11.2506481750465

Common Values

Value                 Count  Frequency (%)
?                     690    0.9%
59.0862637259168      2      < 0.1%
95.8894937439639      2      < 0.1%
49.6314335247607      1      < 0.1%
92.0647309640474      1      < 0.1%
92.1412691395899      1      < 0.1%
92.1499467514924      1      < 0.1%
130.255002418057      1      < 0.1%
49.7989831052405      1      < 0.1%
97.4815434767517      1      < 0.1%
Other values (77395)  77395  99.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
690
 
0.9%
59.08626372591682
 
< 0.1%
95.88949374396392
 
< 0.1%
49.63143352476071
 
< 0.1%
92.06473096404741
 
< 0.1%
92.14126913958991
 
< 0.1%
92.14994675149241
 
< 0.1%
130.2550024180571
 
< 0.1%
49.79898310524051
 
< 0.1%
97.48154347675171
 
< 0.1%
Other values (77395)77395
99.1%

Most occurring characters

Value             Count   Frequency (%)
1                 135458  11.0%
8                 117828  9.6%
3                 116617  9.5%
9                 115902  9.4%
4                 114153  9.3%
2                 113324  9.2%
7                 113176  9.2%
5                 113008  9.2%
6                 110538  9.0%
0                 103128  8.4%
Other values (3)  80238   6.5%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1153132  93.5%
Other Punctuation  78095    6.3%
Dash Punctuation   2143     0.2%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      135458  11.7%
8      117828  10.2%
3      116617  10.1%
9      115902  10.1%
4      114153  9.9%
2      113324  9.8%
7      113176  9.8%
5      113008  9.8%
6      110538  9.6%
0      103128  8.9%
Other Punctuation
Value  Count  Frequency (%)
.      77405  99.1%
?      690    0.9%
Dash Punctuation
Value  Count  Frequency (%)
-      2143   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1233370  100.0%

Most frequent character per script

Common
Value             Count   Frequency (%)
1                 135458  11.0%
8                 117828  9.6%
3                 116617  9.5%
9                 115902  9.4%
4                 114153  9.3%
2                 113324  9.2%
7                 113176  9.2%
5                 113008  9.2%
6                 110538  9.0%
0                 103128  8.4%
Other values (3)  80238   6.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1233370  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
1                 135458  11.0%
8                 117828  9.6%
3                 116617  9.5%
9                 115902  9.4%
4                 114153  9.3%
2                 113324  9.2%
7                 113176  9.2%
5                 113008  9.2%
6                 110538  9.0%
0                 103128  8.4%
Other values (3)  80238   6.5%

Z3
Categorical

HIGH CARDINALITY

Distinct: 77405
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 610.2 KiB

Value                 Count
?                     690
-44.731361092816      2
-7.37234857290389     2
-41.9834955325056     1
1.01703148994075      1
Other values (77400)  77400

Length

Max length: 20
Median length: 17
Mean length: 16.53174298
Min length: 1

Characters and Unicode

Total characters: 1291063
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 77402
Unique (%): 99.1%

Sample

1st row: 0
2nd row: -73.684130041833
3rd row: -70.703990925959
4th row: -71.1383441365739
5th row: -68.9564252307431

Common Values

Value                 Count  Frequency (%)
?                     690    0.9%
-44.731361092816      2      < 0.1%
-7.37234857290389     2      < 0.1%
-41.9834955325056     1      < 0.1%
1.01703148994075      1      < 0.1%
0.824434180731112     1      < 0.1%
0.709341720791423     1      < 0.1%
4.56284739153731      1      < 0.1%
-42.4205822715877     1      < 0.1%
0.177873284433247     1      < 0.1%
Other values (77395)  77395  99.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
690
 
0.9%
44.7313610928162
 
< 0.1%
7.372348572903892
 
< 0.1%
41.98349553250561
 
< 0.1%
1.017031489940751
 
< 0.1%
0.8244341807311121
 
< 0.1%
0.7093417207914231
 
< 0.1%
4.562847391537311
 
< 0.1%
42.42058227158771
 
< 0.1%
0.1778732844332471
 
< 0.1%
Other values (77395)77395
99.1%

Most occurring characters

Value             Count   Frequency (%)
1                 120793  9.4%
2                 119686  9.3%
5                 118854  9.2%
3                 118842  9.2%
4                 117145  9.1%
6                 116673  9.0%
7                 115340  8.9%
8                 114179  8.8%
9                 110544  8.6%
0                 101952  7.9%
Other values (3)  137055  10.6%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1154008  89.4%
Other Punctuation  78095    6.0%
Dash Punctuation   58960    4.6%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      120793  10.5%
2      119686  10.4%
5      118854  10.3%
3      118842  10.3%
4      117145  10.2%
6      116673  10.1%
7      115340  10.0%
8      114179  9.9%
9      110544  9.6%
0      101952  8.8%
Other Punctuation
Value  Count  Frequency (%)
.      77405  99.1%
?      690    0.9%
Dash Punctuation
Value  Count  Frequency (%)
-      58960  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1291063  100.0%

Most frequent character per script

Common
Value             Count   Frequency (%)
1                 120793  9.4%
2                 119686  9.3%
5                 118854  9.2%
3                 118842  9.2%
4                 117145  9.1%
6                 116673  9.0%
7                 115340  8.9%
8                 114179  8.8%
9                 110544  8.6%
0                 101952  7.9%
Other values (3)  137055  10.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1291063  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
1                 120793  9.4%
2                 119686  9.3%
5                 118854  9.2%
3                 118842  9.2%
4                 117145  9.1%
6                 116673  9.0%
7                 115340  8.9%
8                 114179  8.8%
9                 110544  8.6%
0                 101952  7.9%
Other values (3)  137055  10.6%

X4
Categorical

HIGH CARDINALITY

Distinct: 74974
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 610.2 KiB

Value                 Count
?                     3120
79.1681341895142      2
-20.0151569120977     2
77.5665498565998      2
23.6592359195969      1
Other values (74969)  74969

Length

Max length: 21
Median length: 16
Mean length: 15.41315816
Min length: 1

Characters and Unicode

Total characters: 1203706
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 74970
Unique (%): 96.0%

Sample

1st row: 0
2nd row: 59.1885757027887
3rd row: 61.5874515532753
4th row: 61.6864271910576
5th row: 77.3872254123912

Common Values

Value                 Count  Frequency (%)
?                     3120   4.0%
79.1681341895142      2      < 0.1%
-20.0151569120977     2      < 0.1%
77.5665498565998      2      < 0.1%
23.6592359195969      1      < 0.1%
23.7809775414175      1      < 0.1%
38.3043380239614      1      < 0.1%
22.428188498175       1      < 0.1%
84.137754478412       1      < 0.1%
65.2468241252096      1      < 0.1%
Other values (74964)  74964  96.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3120
 
4.0%
79.16813418951422
 
< 0.1%
20.01515691209772
 
< 0.1%
77.56654985659982
 
< 0.1%
23.65923591959691
 
< 0.1%
23.78097754141751
 
< 0.1%
38.30433802396141
 
< 0.1%
22.4281884981751
 
< 0.1%
84.1377544784121
 
< 0.1%
65.24682412520961
 
< 0.1%
Other values (74964)74964
96.0%

Most occurring characters

Value             Count   Frequency (%)
1                 119614  9.9%
5                 117771  9.8%
6                 115227  9.6%
4                 112718  9.4%
7                 112264  9.3%
3                 111522  9.3%
2                 110818  9.2%
8                 109869  9.1%
9                 107924  9.0%
0                 99448   8.3%
Other values (3)  86531   7.2%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     1117175  92.8%
Other Punctuation  78095    6.5%
Dash Punctuation   8436     0.7%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      119614  10.7%
5      117771  10.5%
6      115227  10.3%
4      112718  10.1%
7      112264  10.0%
3      111522  10.0%
2      110818  9.9%
8      109869  9.8%
9      107924  9.7%
0      99448   8.9%
Other Punctuation
Value  Count  Frequency (%)
.      74975  96.0%
?      3120   4.0%
Dash Punctuation
Value  Count  Frequency (%)
-      8436   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1203706  100.0%

Most frequent character per script

Common
Value             Count   Frequency (%)
1                 119614  9.9%
5                 117771  9.8%
6                 115227  9.6%
4                 112718  9.4%
7                 112264  9.3%
3                 111522  9.3%
2                 110818  9.2%
8                 109869  9.1%
9                 107924  9.0%
0                 99448   8.3%
Other values (3)  86531   7.2%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1203706  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
1                 119614  9.9%
5                 117771  9.8%
6                 115227  9.6%
4                 112718  9.4%
7                 112264  9.3%
3                 111522  9.3%
2                 110818  9.2%
8                 109869  9.1%
9                 107924  9.0%
0                 99448   8.3%
Other values (3)  86531   7.2%

Y4
Categorical

HIGH CARDINALITY

Distinct74974
Distinct (%)96.0%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
 
3120
43.0903473869482
 
2
41.9840030949135
 
2
139.297041166856
 
2
0
 
1
Other values (74969)
74969 
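
The `?` entries above look like a missing-value placeholder in the raw file, which would explain why a coordinate column is typed as Categorical while reporting 0 missing cells. The following is a minimal sketch of one way to recover a numeric column, assuming pandas is available and that `?` really does mean "missing"; the series name `y4` and its contents are illustrative, not taken from the actual dataset loader.

```python
import pandas as pd

# Illustrative stand-in for a column read as strings, with "?" as a placeholder
y4 = pd.Series(["0", "?", "43.0903473869482", "41.9840030949135", "?"])

# Mask the placeholder (it becomes NaN), then parse the remaining strings as floats.
# Unlike errors="coerce", this still raises on any other unexpected token.
y4_numeric = pd.to_numeric(y4.where(y4 != "?"))

missing_count = int(y4_numeric.isna().sum())
```

Masking only the known placeholder, rather than coercing every unparseable value, keeps other malformed entries visible as errors instead of silently turning them into missing values.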

Length

Max length19
Median length16
Mean length15.33095165
Min length1

Characters and Unicode

Total characters1197286
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique74970 ?
Unique (%)96.0%

Sample

1st row0
2nd row10.6789364098231
3rd row11.7799190329758
4th row11.7934398850428
5th row42.7178334810919

Common Values

Value                 Count   Frequency (%)
?                     3120    4.0%
43.0903473869482      2       < 0.1%
41.9840030949135      2       < 0.1%
139.297041166856      2       < 0.1%
0                     1       < 0.1%
101.513039839189      1       < 0.1%
97.3499977803205      1       < 0.1%
142.921684460047      1       < 0.1%
97.5852744886196      1       < 0.1%
131.885056118263      1       < 0.1%
Other values (74964)  74964   96.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3120
 
4.0%
43.09034738694822
 
< 0.1%
41.98400309491352
 
< 0.1%
139.2970411668562
 
< 0.1%
01
 
< 0.1%
101.5130398391891
 
< 0.1%
97.34999778032051
 
< 0.1%
142.9216844600471
 
< 0.1%
97.58527448861961
 
< 0.1%
131.8850561182631
 
< 0.1%
Other values (74964)74964
96.0%

Most occurring characters

ValueCountFrequency (%)
1133023
11.1%
9113213
9.5%
8112562
9.4%
3112345
9.4%
4111361
9.3%
2110267
9.2%
5109074
9.1%
7108750
9.1%
6106045
8.9%
0100258
8.4%
Other values (3)80388
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1116898
93.3%
Other Punctuation78095
 
6.5%
Dash Punctuation2293
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1133023
11.9%
9113213
10.1%
8112562
10.1%
3112345
10.1%
4111361
10.0%
2110267
9.9%
5109074
9.8%
7108750
9.7%
6106045
9.5%
0100258
9.0%
Other Punctuation
ValueCountFrequency (%)
.74975
96.0%
?3120
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
-2293
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1197286
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1133023
11.1%
9113213
9.5%
8112562
9.4%
3112345
9.4%
4111361
9.3%
2110267
9.2%
5109074
9.1%
7108750
9.1%
6106045
8.9%
0100258
8.4%
Other values (3)80388
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII1197286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1133023
11.1%
9113213
9.5%
8112562
9.4%
3112345
9.4%
4111361
9.3%
2110267
9.2%
5109074
9.1%
7108750
9.1%
6106045
8.9%
0100258
8.4%
Other values (3)80388
6.7%

Z4
Categorical

HIGH CARDINALITY

Distinct74975
Distinct (%)96.0%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
 
3120
-71.1013128660108
 
2
-72.8833791767483
 
2
-1.11613099270773
 
1
-0.406177249212901
 
1
Other values (74970)
74970 

Length

Max length21
Median length17
Mean length16.05045073
Min length1

Characters and Unicode

Total characters1253476
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique74972 ?
Unique (%)96.0%

Sample

1st row0
2nd row-71.2977813147725
3rd row-68.827417756239
4th row-68.88931646056
5th row-72.0151462991019

Common Values

Value                 Count   Frequency (%)
?                     3120    4.0%
-71.1013128660108     2       < 0.1%
-72.8833791767483     2       < 0.1%
-1.11613099270773     1       < 0.1%
-0.406177249212901    1       < 0.1%
7.36047612565754      1       < 0.1%
-1.23323393278556     1       < 0.1%
-11.5185046401756     1       < 0.1%
2.09357149016235      1       < 0.1%
-3.19298182452546     1       < 0.1%
Other values (74965)  74965   96.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3120
 
4.0%
71.10131286601082
 
< 0.1%
72.88337917674832
 
< 0.1%
1.116130992707731
 
< 0.1%
0.4061772492129011
 
< 0.1%
7.360476125657541
 
< 0.1%
1.233233932785561
 
< 0.1%
11.51850464017561
 
< 0.1%
2.093571490162351
 
< 0.1%
3.192981824525461
 
< 0.1%
Other values (74965)74965
96.0%

Most occurring characters

ValueCountFrequency (%)
1117109
9.3%
2115992
9.3%
5115405
9.2%
3114803
9.2%
4113037
9.0%
6112905
9.0%
7111253
8.9%
8110899
8.8%
9107366
8.6%
099055
7.9%
Other values (3)135652
10.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1117824
89.2%
Other Punctuation78095
 
6.2%
Dash Punctuation57557
 
4.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1117109
10.5%
2115992
10.4%
5115405
10.3%
3114803
10.3%
4113037
10.1%
6112905
10.1%
7111253
10.0%
8110899
9.9%
9107366
9.6%
099055
8.9%
Other Punctuation
ValueCountFrequency (%)
.74975
96.0%
?3120
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
-57557
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1253476
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1117109
9.3%
2115992
9.3%
5115405
9.2%
3114803
9.2%
4113037
9.0%
6112905
9.0%
7111253
8.9%
8110899
8.8%
9107366
8.6%
099055
7.9%
Other values (3)135652
10.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII1253476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1117109
9.3%
2115992
9.3%
5115405
9.2%
3114803
9.2%
4113037
9.0%
6112905
9.0%
7111253
8.9%
8110899
8.8%
9107366
8.6%
099055
7.9%
Other values (3)135652
10.8%

X5
Categorical

HIGH CARDINALITY

Distinct65074
Distinct (%)83.3%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
13023 
8.83793809199762
 
1
47.9669093883694
 
1
47.9606888522214
 
1
8.40508208999718
 
1
Other values (65069)
65069 

Length

Max length20
Median length16
Mean length13.51559619
Min length1

Characters and Unicode

Total characters1055514
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique65073 ?
Unique (%)83.3%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     13023   16.7%
8.83793809199762      1       < 0.1%
47.9669093883694      1       < 0.1%
47.9606888522214      1       < 0.1%
8.40508208999718      1       < 0.1%
110.701994208143      1       < 0.1%
48.0680141346651      1       < 0.1%
8.0165074044731       1       < 0.1%
7.14771151021097      1       < 0.1%
110.021183044075      1       < 0.1%
Other values (65064)  65064   83.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
13023
 
16.7%
8.837938091997621
 
< 0.1%
47.96690938836941
 
< 0.1%
47.96068885222141
 
< 0.1%
8.405082089997181
 
< 0.1%
110.7019942081431
 
< 0.1%
48.06801413466511
 
< 0.1%
8.01650740447311
 
< 0.1%
7.147711510210971
 
< 0.1%
110.0211830440751
 
< 0.1%
Other values (65064)65064
83.3%

Most occurring characters

ValueCountFrequency (%)
1104776
9.9%
5101703
9.6%
498876
9.4%
698647
9.3%
798123
9.3%
297179
9.2%
396326
9.1%
895427
9.0%
992914
8.8%
085803
8.1%
Other values (3)85740
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number969774
91.9%
Other Punctuation78095
 
7.4%
Dash Punctuation7645
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1104776
10.8%
5101703
10.5%
498876
10.2%
698647
10.2%
798123
10.1%
297179
10.0%
396326
9.9%
895427
9.8%
992914
9.6%
085803
8.8%
Other Punctuation
ValueCountFrequency (%)
.65072
83.3%
?13023
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
-7645
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1055514
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1104776
9.9%
5101703
9.6%
498876
9.4%
698647
9.3%
798123
9.3%
297179
9.2%
396326
9.1%
895427
9.0%
992914
8.8%
085803
8.1%
Other values (3)85740
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1055514
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1104776
9.9%
5101703
9.6%
498876
9.4%
698647
9.3%
798123
9.3%
297179
9.2%
396326
9.1%
895427
9.0%
992914
8.8%
085803
8.1%
Other values (3)85740
8.1%

Y5
Categorical

HIGH CARDINALITY

Distinct65072
Distinct (%)83.3%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
13023 
86.0046621603218
 
2
101.201915490641
 
2
135.97320926227
 
1
96.637770018805
 
1
Other values (65067)
65067 

Length

Max length20
Median length16
Mean length13.43639879
Min length1

Characters and Unicode

Total characters1049329
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique65069 ?
Unique (%)83.3%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     13023   16.7%
86.0046621603218      2       < 0.1%
101.201915490641      2       < 0.1%
135.97320926227       1       < 0.1%
96.637770018805       1       < 0.1%
96.3935397691161      1       < 0.1%
96.3891774264784      1       < 0.1%
96.432086322          1       < 0.1%
84.3535071023263      1       < 0.1%
46.5865466943928      1       < 0.1%
Other values (65062)  65062   83.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
13023
 
16.7%
86.00466216032182
 
< 0.1%
101.2019154906412
 
< 0.1%
135.973209262271
 
< 0.1%
96.6377700188051
 
< 0.1%
96.39353976911611
 
< 0.1%
96.38917742647841
 
< 0.1%
96.4320863221
 
< 0.1%
84.35350710232631
 
< 0.1%
46.58654669439281
 
< 0.1%
Other values (65062)65062
83.3%

Most occurring characters

ValueCountFrequency (%)
1117175
11.2%
999021
9.4%
397924
9.3%
896352
9.2%
496159
9.2%
295712
9.1%
594018
9.0%
793441
8.9%
691679
8.7%
087704
8.4%
Other values (4)80144
7.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number969185
92.4%
Other Punctuation78095
 
7.4%
Dash Punctuation2048
 
0.2%
Uppercase Letter1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1117175
12.1%
999021
10.2%
397924
10.1%
896352
9.9%
496159
9.9%
295712
9.9%
594018
9.7%
793441
9.6%
691679
9.5%
087704
9.0%
Other Punctuation
ValueCountFrequency (%)
.65072
83.3%
?13023
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
-2048
100.0%
Uppercase Letter
ValueCountFrequency (%)
E1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1049328
> 99.9%
Latin1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1117175
11.2%
999021
9.4%
397924
9.3%
896352
9.2%
496159
9.2%
295712
9.1%
594018
9.0%
793441
8.9%
691679
8.7%
087704
8.4%
Other values (3)80143
7.6%
Latin
ValueCountFrequency (%)
E1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1049329
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1117175
11.2%
999021
9.4%
397924
9.3%
896352
9.2%
496159
9.2%
295712
9.1%
594018
9.0%
793441
8.9%
691679
8.7%
087704
8.4%
Other values (4)80144
7.6%

Z5
Categorical

HIGH CARDINALITY

Distinct65074
Distinct (%)83.3%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
13023 
-36.8945196682354
 
1
-12.1537413670101
 
1
-12.4157018730824
 
1
-38.0090137197266
 
1
Other values (65069)
65069 

Length

Max length21
Median length17
Mean length14.04604589
Min length1

Characters and Unicode

Total characters1096940
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique65073 ?
Unique (%)83.3%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     13023   16.7%
-36.8945196682354     1       < 0.1%
-12.1537413670101     1       < 0.1%
-12.4157018730824     1       < 0.1%
-38.0090137197266     1       < 0.1%
-49.5066933962768     1       < 0.1%
-12.1184123571065     1       < 0.1%
-38.2778108594302     1       < 0.1%
-38.9174699746726     1       < 0.1%
-50.6652484998276     1       < 0.1%
Other values (65064)  65064   83.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
13023
 
16.7%
36.89451966823541
 
< 0.1%
12.15374136701011
 
< 0.1%
12.41570187308241
 
< 0.1%
38.00901371972661
 
< 0.1%
49.50669339627681
 
< 0.1%
12.11841235710651
 
< 0.1%
38.27781085943021
 
< 0.1%
38.91746997467261
 
< 0.1%
50.66524849982761
 
< 0.1%
Other values (65064)65064
83.3%

Most occurring characters

ValueCountFrequency (%)
1103010
9.4%
2100510
9.2%
399593
9.1%
599243
9.0%
498385
9.0%
696893
8.8%
896605
8.8%
795888
8.7%
993989
8.6%
085999
7.8%
Other values (4)126825
11.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number970115
88.4%
Other Punctuation78095
 
7.1%
Dash Punctuation48729
 
4.4%
Uppercase Letter1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1103010
10.6%
2100510
10.4%
399593
10.3%
599243
10.2%
498385
10.1%
696893
10.0%
896605
10.0%
795888
9.9%
993989
9.7%
085999
8.9%
Other Punctuation
ValueCountFrequency (%)
.65072
83.3%
?13023
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
-48729
100.0%
Uppercase Letter
ValueCountFrequency (%)
E1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1096939
> 99.9%
Latin1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1103010
9.4%
2100510
9.2%
399593
9.1%
599243
9.0%
498385
9.0%
696893
8.8%
896605
8.8%
795888
8.7%
993989
8.6%
085999
7.8%
Other values (3)126824
11.6%
Latin
ValueCountFrequency (%)
E1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1096940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1103010
9.4%
2100510
9.2%
399593
9.1%
599243
9.0%
498385
9.0%
696893
8.8%
896605
8.8%
795888
8.7%
993989
8.6%
085999
7.8%
Other values (4)126825
11.6%

X6
Categorical

HIGH CARDINALITY

Distinct52247
Distinct (%)66.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
25848 
75.0765407117555
 
3
13.2024872848435
 
1
45.3947074178752
 
1
40.0141447532884
 
1
Other values (52242)
52242 

Length

Max length20
Median length16
Mean length11.05943966
Min length1

Characters and Unicode

Total characters863698
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique52245 ?
Unique (%)66.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     25848   33.1%
75.0765407117555      3       < 0.1%
13.2024872848435      1       < 0.1%
45.3947074178752      1       < 0.1%
40.0141447532884      1       < 0.1%
56.9721721244169      1       < 0.1%
23.6986968143992      1       < 0.1%
35.4600805445979      1       < 0.1%
62.5442683133622      1       < 0.1%
68.4343325379142      1       < 0.1%
Other values (52237)  52237   66.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
25848
33.1%
75.07654071175553
 
< 0.1%
13.20248728484351
 
< 0.1%
45.39470741787521
 
< 0.1%
40.01414475328841
 
< 0.1%
56.97217212441691
 
< 0.1%
23.69869681439921
 
< 0.1%
35.46008054459791
 
< 0.1%
62.54426831336221
 
< 0.1%
68.43433253791421
 
< 0.1%
Other values (52237)52237
66.9%

Most occurring characters

ValueCountFrequency (%)
184956
9.8%
579980
9.3%
679559
9.2%
778709
9.1%
278137
9.0%
877861
9.0%
477582
9.0%
376730
8.9%
974685
8.6%
070426
8.2%
Other values (3)85073
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number778625
90.2%
Other Punctuation78095
 
9.0%
Dash Punctuation6978
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
184956
10.9%
579980
10.3%
679559
10.2%
778709
10.1%
278137
10.0%
877861
10.0%
477582
10.0%
376730
9.9%
974685
9.6%
070426
9.0%
Other Punctuation
ValueCountFrequency (%)
.52247
66.9%
?25848
33.1%
Dash Punctuation
ValueCountFrequency (%)
-6978
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common863698
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
184956
9.8%
579980
9.3%
679559
9.2%
778709
9.1%
278137
9.0%
877861
9.0%
477582
9.0%
376730
8.9%
974685
8.6%
070426
8.2%
Other values (3)85073
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII863698
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
184956
9.8%
579980
9.3%
679559
9.2%
778709
9.1%
278137
9.0%
877861
9.0%
477582
9.0%
376730
8.9%
974685
8.6%
070426
8.2%
Other values (3)85073
9.8%

Y6
Categorical

HIGH CARDINALITY

Distinct52246
Distinct (%)66.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
25848 
58.9437787796554
 
3
60.9505325096857
 
2
143.792897243129
 
1
111.403179839045
 
1
Other values (52241)
52241 

Length

Max length19
Median length16
Mean length10.99580004
Min length1

Characters and Unicode

Total characters858728
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique52243 ?
Unique (%)66.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     25848   33.1%
58.9437787796554      3       < 0.1%
60.9505325096857      2       < 0.1%
143.792897243129      1       < 0.1%
111.403179839045      1       < 0.1%
104.120680638426      1       < 0.1%
96.4654106663325      1       < 0.1%
121.761480360594      1       < 0.1%
105.543078218602      1       < 0.1%
123.724798899851      1       < 0.1%
Other values (52236)  52236   66.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
25848
33.1%
58.94377877965543
 
< 0.1%
60.95053250968572
 
< 0.1%
143.7928972431291
 
< 0.1%
111.4031798390451
 
< 0.1%
104.1206806384261
 
< 0.1%
96.46541066633251
 
< 0.1%
121.7614803605941
 
< 0.1%
105.5430782186021
 
< 0.1%
123.7247988998511
 
< 0.1%
Other values (52236)52236
66.9%

Most occurring characters

ValueCountFrequency (%)
195280
11.1%
378917
9.2%
978785
9.2%
477526
9.0%
877493
9.0%
276467
8.9%
575315
8.8%
774167
8.6%
673334
8.5%
071097
8.3%
Other values (3)80347
9.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number778381
90.6%
Other Punctuation78095
 
9.1%
Dash Punctuation2252
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
195280
12.2%
378917
10.1%
978785
10.1%
477526
10.0%
877493
10.0%
276467
9.8%
575315
9.7%
774167
9.5%
673334
9.4%
071097
9.1%
Other Punctuation
ValueCountFrequency (%)
.52247
66.9%
?25848
33.1%
Dash Punctuation
ValueCountFrequency (%)
-2252
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common858728
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
195280
11.1%
378917
9.2%
978785
9.2%
477526
9.0%
877493
9.0%
276467
8.9%
575315
8.8%
774167
8.6%
673334
8.5%
071097
8.3%
Other values (3)80347
9.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII858728
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
195280
11.1%
378917
9.2%
978785
9.2%
477526
9.0%
877493
9.0%
276467
8.9%
575315
8.8%
774167
8.6%
673334
8.5%
071097
8.3%
Other values (3)80347
9.4%

Z6
Categorical

HIGH CARDINALITY

Distinct52247
Distinct (%)66.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
25848 
-46.6418085995195
 
3
-41.7091782810854
 
1
-68.8564859765744
 
1
-28.2984352065888
 
1
Other values (52242)
52242 

Length

Max length20
Median length16
Mean length11.46170098
Min length1

Characters and Unicode

Total characters895113
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique52245 ?
Unique (%)66.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     25848   33.1%
-46.6418085995195     3       < 0.1%
-41.7091782810854     1       < 0.1%
-68.8564859765744     1       < 0.1%
-28.2984352065888     1       < 0.1%
-35.9618378995719     1       < 0.1%
-77.2259761678        1       < 0.1%
-22.5910725713018     1       < 0.1%
-59.3182528656491     1       < 0.1%
-57.1423954499309     1       < 0.1%
Other values (52237)  52237   66.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
25848
33.1%
46.64180859951953
 
< 0.1%
41.70917828108541
 
< 0.1%
68.85648597657441
 
< 0.1%
28.29843520658881
 
< 0.1%
35.96183789957191
 
< 0.1%
77.22597616781
 
< 0.1%
22.59107257130181
 
< 0.1%
59.31825286564911
 
< 0.1%
57.14239544993091
 
< 0.1%
Other values (52237)52237
66.9%

Most occurring characters

ValueCountFrequency (%)
184129
9.4%
282299
9.2%
379277
8.9%
479046
8.8%
578825
8.8%
677707
8.7%
877593
8.7%
776544
8.6%
974684
8.3%
068963
7.7%
Other values (3)116046
13.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number779067
87.0%
Other Punctuation78095
 
8.7%
Dash Punctuation37951
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
184129
10.8%
282299
10.6%
379277
10.2%
479046
10.1%
578825
10.1%
677707
10.0%
877593
10.0%
776544
9.8%
974684
9.6%
068963
8.9%
Other Punctuation
ValueCountFrequency (%)
.52247
66.9%
?25848
33.1%
Dash Punctuation
ValueCountFrequency (%)
-37951
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common895113
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
184129
9.4%
282299
9.2%
379277
8.9%
479046
8.8%
578825
8.8%
677707
8.7%
877593
8.7%
776544
8.6%
974684
8.3%
068963
7.7%
Other values (3)116046
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII895113
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
184129
9.4%
282299
9.2%
379277
8.9%
479046
8.8%
578825
8.8%
677707
8.7%
877593
8.7%
776544
8.6%
974684
8.3%
068963
7.7%
Other values (3)116046
13.0%

X7
Categorical

HIGH CARDINALITY

Distinct38943
Distinct (%)49.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
39152 
71.4715886422563
 
2
27.5902714389679
 
2
0
 
1
81.3730140861057
 
1
Other values (38938)
38938 

Length

Max length20
Median length1
Mean length8.507285905
Min length1

Characters and Unicode

Total characters664385
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38940 ?
Unique (%)49.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     39152   50.1%
71.4715886422563      2       < 0.1%
27.5902714389679      2       < 0.1%
0                     1       < 0.1%
81.3730140861057      1       < 0.1%
86.9716657086393      1       < 0.1%
70.8239924727279      1       < 0.1%
70.9909150498528      1       < 0.1%
70.5845520970293      1       < 0.1%
70.6538687442854      1       < 0.1%
Other values (38933)  38933   49.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
39152
50.1%
71.47158864225632
 
< 0.1%
27.59027143896792
 
< 0.1%
01
 
< 0.1%
81.37301408610571
 
< 0.1%
86.97166570863931
 
< 0.1%
70.82399247272791
 
< 0.1%
70.99091504985281
 
< 0.1%
70.58455209702931
 
< 0.1%
70.65386874428541
 
< 0.1%
Other values (38933)38933
49.9%

Most occurring characters

ValueCountFrequency (%)
164061
9.6%
259575
9.0%
758505
8.8%
558441
8.8%
858269
8.8%
658118
8.7%
457353
8.6%
357135
8.6%
956461
8.5%
052437
7.9%
Other values (3)84030
12.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number580355
87.4%
Other Punctuation78095
 
11.8%
Dash Punctuation5935
 
0.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
164061
11.0%
259575
10.3%
758505
10.1%
558441
10.1%
858269
10.0%
658118
10.0%
457353
9.9%
357135
9.8%
956461
9.7%
052437
9.0%
Other Punctuation
ValueCountFrequency (%)
?39152
50.1%
.38943
49.9%
Dash Punctuation
ValueCountFrequency (%)
-5935
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common664385
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
164061
9.6%
259575
9.0%
758505
8.8%
558441
8.8%
858269
8.8%
658118
8.7%
457353
8.6%
357135
8.6%
956461
8.5%
052437
7.9%
Other values (3)84030
12.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII664385
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
164061
9.6%
259575
9.0%
758505
8.8%
558441
8.8%
858269
8.8%
658118
8.7%
457353
8.6%
357135
8.6%
956461
8.5%
052437
7.9%
Other values (3)84030
12.6%

Y7
Categorical

HIGH CARDINALITY

Distinct38944
Distinct (%)49.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
39152 
68.4080745074578
 
2
10.7888882722269
 
1
95.8534381622457
 
1
143.837117061038
 
1
Other values (38939)
38939 

Length

Max length21
Median length1
Mean length8.438024995
Min length1

Characters and Unicode

Total characters658976
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38942 ?
Unique (%)49.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     39152   50.1%
68.4080745074578      2       < 0.1%
10.7888882722269      1       < 0.1%
95.8534381622457      1       < 0.1%
143.837117061038      1       < 0.1%
33.5127580206469      1       < 0.1%
10.6134236377188      1       < 0.1%
10.677982853155       1       < 0.1%
10.7016374242206      1       < 0.1%
95.8918178179915      1       < 0.1%
Other values (38934)  38934   49.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
39152
50.1%
68.40807450745782
 
< 0.1%
10.78888827222691
 
< 0.1%
95.85343816224571
 
< 0.1%
143.8371170610381
 
< 0.1%
33.51275802064691
 
< 0.1%
10.61342363771881
 
< 0.1%
10.6779828531551
 
< 0.1%
10.70163742422061
 
< 0.1%
95.89181781799151
 
< 0.1%
Other values (38934)38934
49.9%

Most occurring characters

ValueCountFrequency (%)
171254
10.8%
958789
8.9%
358220
8.8%
857598
8.7%
257479
8.7%
457465
8.7%
556324
8.5%
755517
8.4%
654406
8.3%
052970
8.0%
Other values (3)78954
12.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number580022
88.0%
Other Punctuation78095
 
11.9%
Dash Punctuation859
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
171254
12.3%
958789
10.1%
358220
10.0%
857598
9.9%
257479
9.9%
457465
9.9%
556324
9.7%
755517
9.6%
654406
9.4%
052970
9.1%
Other Punctuation
ValueCountFrequency (%)
?39152
50.1%
.38943
49.9%
Dash Punctuation
ValueCountFrequency (%)
-859
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common658976
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
171254
10.8%
958789
8.9%
358220
8.8%
857598
8.7%
257479
8.7%
457465
8.7%
556324
8.5%
755517
8.4%
654406
8.3%
052970
8.0%
Other values (3)78954
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII658976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
171254
10.8%
958789
8.9%
358220
8.8%
857598
8.7%
257479
8.7%
457465
8.7%
556324
8.5%
755517
8.4%
654406
8.3%
052970
8.0%
Other values (3)78954
12.0%

Z7
Categorical

HIGH CARDINALITY

Distinct38944
Distinct (%)49.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
39152 
52.9262198038884
 
2
-49.4095486841017
 
1
-11.6654461652288
 
1
7.10437002104889
 
1
Other values (38939)
38939 

Length

Max length20
Median length1
Mean length8.779540565
Min length1

Characters and Unicode

Total characters685647
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38942 ?
Unique (%)49.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     39152   50.1%
52.9262198038884      2       < 0.1%
-49.4095486841017     1       < 0.1%
-11.6654461652288     1       < 0.1%
7.10437002104889      1       < 0.1%
-46.8983225856284     1       < 0.1%
-49.3936299101849     1       < 0.1%
-49.3001213803194     1       < 0.1%
-49.4201277571404     1       < 0.1%
-11.5390941488682     1       < 0.1%
Other values (38934)  38934   49.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
39152
50.1%
52.92621980388842
 
< 0.1%
49.40954868410171
 
< 0.1%
11.66544616522881
 
< 0.1%
7.104370021048891
 
< 0.1%
46.89832258562841
 
< 0.1%
49.39362991018491
 
< 0.1%
49.30012138031941
 
< 0.1%
49.42012775714041
 
< 0.1%
11.53909414886821
 
< 0.1%
Other values (38934)38934
49.9%

Most occurring characters

ValueCountFrequency (%)
163751
9.3%
262368
9.1%
459729
8.7%
359255
8.6%
558033
8.5%
657212
8.3%
757067
8.3%
856300
8.2%
955592
8.1%
051514
7.5%
Other values (3)104826
15.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number580821
84.7%
Other Punctuation78095
 
11.4%
Dash Punctuation26731
 
3.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
163751
11.0%
262368
10.7%
459729
10.3%
359255
10.2%
558033
10.0%
657212
9.9%
757067
9.8%
856300
9.7%
955592
9.6%
051514
8.9%
Other Punctuation
ValueCountFrequency (%)
?39152
50.1%
.38943
49.9%
Dash Punctuation
ValueCountFrequency (%)
-26731
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common685647
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
163751
9.3%
262368
9.1%
459729
8.7%
359255
8.6%
558033
8.5%
657212
8.3%
757067
8.3%
856300
8.2%
955592
8.1%
051514
7.5%
Other values (3)104826
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII685647
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
163751
9.3%
262368
9.1%
459729
8.7%
359255
8.6%
558033
8.5%
657212
8.3%
757067
8.3%
856300
8.2%
955592
8.1%
051514
7.5%
Other values (3)104826
15.3%

X8
Categorical

HIGH CARDINALITY

Distinct30563
Distinct (%)39.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
47532 
13.7982341188599
 
2
78.9698252731303
 
2
0
 
1
4.22269789279762
 
1
Other values (30558)
30558 

Length

Max length19
Median length1
Mean length6.888291334
Min length1

Characters and Unicode

Total characters537948
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30560 ?
Unique (%)39.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count   Frequency (%)
?                     47532   60.9%
13.7982341188599      2       < 0.1%
78.9698252731303      2       < 0.1%
0                     1       < 0.1%
4.22269789279762      1       < 0.1%
-4.34175849378233     1       < 0.1%
-4.21124204728999     1       < 0.1%
-4.07884657398539     1       < 0.1%
85.2967821287904      1       < 0.1%
-3.80878425609929     1       < 0.1%
Other values (30553)  30553   39.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
47532
60.9%
13.79823411885992
 
< 0.1%
78.96982527313032
 
< 0.1%
01
 
< 0.1%
4.222697892797621
 
< 0.1%
4.341758493782331
 
< 0.1%
4.211242047289991
 
< 0.1%
4.078846573985391
 
< 0.1%
85.29678212879041
 
< 0.1%
3.808784256099291
 
< 0.1%
Other values (30553)30553
39.1%

Most occurring characters

ValueCountFrequency (%)
148775
9.1%
?47532
8.8%
846626
8.7%
646518
8.6%
246252
8.6%
745873
8.5%
345357
8.4%
544922
8.4%
444889
8.3%
944759
8.3%
Other values (3)76445
14.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number455391
84.7%
Other Punctuation78095
 
14.5%
Dash Punctuation4462
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
148775
10.7%
846626
10.2%
646518
10.2%
246252
10.2%
745873
10.1%
345357
10.0%
544922
9.9%
444889
9.9%
944759
9.8%
041420
9.1%
Other Punctuation
ValueCountFrequency (%)
?47532
60.9%
.30563
39.1%
Dash Punctuation
ValueCountFrequency (%)
-4462
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common537948
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
148775
9.1%
?47532
8.8%
846626
8.7%
646518
8.6%
246252
8.6%
745873
8.5%
345357
8.4%
544922
8.4%
444889
8.3%
944759
8.3%
Other values (3)76445
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII537948
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
148775
9.1%
?47532
8.8%
846626
8.7%
646518
8.6%
246252
8.6%
745873
8.5%
345357
8.4%
544922
8.4%
444889
8.3%
944759
8.3%
Other values (3)76445
14.2%

Y8
Categorical

HIGH CARDINALITY

Distinct30563
Distinct (%)39.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
47532 
146.527510097484
 
2
92.3883424118537
 
2
0
 
1
130.488436813894
 
1
Other values (30558)
30558 

Length

Max length20
Median length1
Mean length6.827840094
Min length1

Characters and Unicode

Total characters533227
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30560 ?
Unique (%)39.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     47532  60.9%
146.527510097484      2      < 0.1%
92.3883424118537      2      < 0.1%
0                     1      < 0.1%
130.488436813894      1      < 0.1%
92.6269407178814      1      < 0.1%
92.6223017952659      1      < 0.1%
92.6255486202587      1      < 0.1%
33.3690381770695      1      < 0.1%
92.6496519793604      1      < 0.1%
Other values (30553)  30553  39.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
47532
60.9%
146.5275100974842
 
< 0.1%
92.38834241185372
 
< 0.1%
01
 
< 0.1%
130.4884368138941
 
< 0.1%
92.62694071788141
 
< 0.1%
92.62230179526591
 
< 0.1%
92.62554862025871
 
< 0.1%
33.36903817706951
 
< 0.1%
92.64965197936041
 
< 0.1%
Other values (30553)30553
39.1%

Most occurring characters

ValueCountFrequency (%)
155472
10.4%
?47532
8.9%
346177
8.7%
945650
8.6%
445505
8.5%
844996
8.4%
244757
8.4%
544471
8.3%
743384
8.1%
643298
8.1%
Other values (3)71985
13.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number455076
85.3%
Other Punctuation78095
 
14.6%
Dash Punctuation56
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
155472
12.2%
346177
10.1%
945650
10.0%
445505
10.0%
844996
9.9%
244757
9.8%
544471
9.8%
743384
9.5%
643298
9.5%
041366
9.1%
Other Punctuation
ValueCountFrequency (%)
?47532
60.9%
.30563
39.1%
Dash Punctuation
ValueCountFrequency (%)
-56
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common533227
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
155472
10.4%
?47532
8.9%
346177
8.7%
945650
8.6%
445505
8.5%
844996
8.4%
244757
8.4%
544471
8.3%
743384
8.1%
643298
8.1%
Other values (3)71985
13.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII533227
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
155472
10.4%
?47532
8.9%
346177
8.7%
945650
8.6%
445505
8.5%
844996
8.4%
244757
8.4%
544471
8.3%
743384
8.1%
643298
8.1%
Other values (3)71985
13.5%

Z8
Categorical

HIGH CARDINALITY

Distinct30563
Distinct (%)39.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
47532 
-26.1321290969268
 
2
-23.6332487422053
 
2
0
 
1
2.61576251372805
 
1
Other values (30558)
30558 

Length

Max length20
Median length1
Mean length7.107034931
Min length1

Characters and Unicode

Total characters555031
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30560 ?
Unique (%)39.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     47532  60.9%
-26.1321290969268     2      < 0.1%
-23.6332487422053     2      < 0.1%
0                     1      < 0.1%
2.61576251372805      1      < 0.1%
-0.565116917070197    1      < 0.1%
-0.576697274034418    1      < 0.1%
-0.607004373097491    1      < 0.1%
-47.6639646579521     1      < 0.1%
-0.0708167300357356   1      < 0.1%
Other values (30553)  30553  39.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
47532
60.9%
26.13212909692682
 
< 0.1%
23.63324874220532
 
< 0.1%
01
 
< 0.1%
2.615762513728051
 
< 0.1%
0.5651169170701971
 
< 0.1%
0.5766972740344181
 
< 0.1%
0.6070043730974911
 
< 0.1%
47.66396465795211
 
< 0.1%
0.07081673003573561
 
< 0.1%
Other values (30553)30553
39.1%

Most occurring characters

ValueCountFrequency (%)
149919
9.0%
248513
8.7%
?47532
8.6%
346653
8.4%
446272
8.3%
545847
8.3%
745157
8.1%
644849
8.1%
844719
8.1%
943525
7.8%
Other values (3)92045
16.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number455662
82.1%
Other Punctuation78095
 
14.1%
Dash Punctuation21274
 
3.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
149919
11.0%
248513
10.6%
346653
10.2%
446272
10.2%
545847
10.1%
745157
9.9%
644849
9.8%
844719
9.8%
943525
9.6%
040208
8.8%
Other Punctuation
ValueCountFrequency (%)
?47532
60.9%
.30563
39.1%
Dash Punctuation
ValueCountFrequency (%)
-21274
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common555031
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
149919
9.0%
248513
8.7%
?47532
8.6%
346653
8.4%
446272
8.3%
545847
8.3%
745157
8.1%
644849
8.1%
844719
8.1%
943525
7.8%
Other values (3)92045
16.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII555031
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
149919
9.0%
248513
8.7%
?47532
8.6%
346653
8.4%
446272
8.3%
545847
8.3%
745157
8.1%
644849
8.1%
844719
8.1%
943525
7.8%
Other values (3)92045
16.6%

X9
Categorical

HIGH CARDINALITY

Distinct23969
Distinct (%)30.7%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
54128 
82.5186179394793
 
1
57.8859825192586
 
1
111.219492017796
 
1
59.0092930737533
 
1
Other values (23964)
23964 

Length

Max length20
Median length1
Mean length5.610389777
Min length1

Characters and Unicode

Total characters438149
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23968 ?
Unique (%)30.7%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     54128  69.3%
82.5186179394793      1      < 0.1%
57.8859825192586      1      < 0.1%
111.219492017796      1      < 0.1%
59.0092930737533      1      < 0.1%
58.9822779722393      1      < 0.1%
71.9994336746061      1      < 0.1%
58.5797900199168      1      < 0.1%
58.4902899587294      1      < 0.1%
19.4772948477265      1      < 0.1%
Other values (23959)  23959  30.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
54128
69.3%
82.51861793947931
 
< 0.1%
57.88598251925861
 
< 0.1%
111.2194920177961
 
< 0.1%
59.00929307375331
 
< 0.1%
58.98227797223931
 
< 0.1%
71.99943367460611
 
< 0.1%
58.57979001991681
 
< 0.1%
58.49028995872941
 
< 0.1%
19.47729484772651
 
< 0.1%
Other values (23959)23959
30.7%

Most occurring characters

ValueCountFrequency (%)
?54128
12.4%
138912
8.9%
836786
8.4%
636631
8.4%
736088
8.2%
335733
8.2%
235345
8.1%
434998
8.0%
534877
8.0%
934683
7.9%
Other values (3)59968
13.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number357022
81.5%
Other Punctuation78095
 
17.8%
Dash Punctuation3032
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
138912
10.9%
836786
10.3%
636631
10.3%
736088
10.1%
335733
10.0%
235345
9.9%
434998
9.8%
534877
9.8%
934683
9.7%
032969
9.2%
Other Punctuation
ValueCountFrequency (%)
?54128
69.3%
.23967
30.7%
Dash Punctuation
ValueCountFrequency (%)
-3032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common438149
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?54128
12.4%
138912
8.9%
836786
8.4%
636631
8.4%
736088
8.2%
335733
8.2%
235345
8.1%
434998
8.0%
534877
8.0%
934683
7.9%
Other values (3)59968
13.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII438149
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?54128
12.4%
138912
8.9%
836786
8.4%
636631
8.4%
736088
8.2%
335733
8.2%
235345
8.1%
434998
8.0%
534877
8.0%
934683
7.9%
Other values (3)59968
13.7%

Y9
Categorical

HIGH CARDINALITY

Distinct23969
Distinct (%)30.7%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
54128 
132.005679813232
 
1
145.952177146186
 
1
43.9686789745966
 
1
145.557191610617
 
1
Other values (23964)
23964 

Length

Max length17
Median length1
Mean length5.571207232
Min length1

Characters and Unicode

Total characters435089
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23968 ?
Unique (%)30.7%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     54128  69.3%
132.005679813232      1      < 0.1%
145.952177146186      1      < 0.1%
43.9686789745966      1      < 0.1%
145.557191610617      1      < 0.1%
145.581961251061      1      < 0.1%
11.7319133811541      1      < 0.1%
145.69998341729       1      < 0.1%
145.616129297374      1      < 0.1%
142.187472541811      1      < 0.1%
Other values (23959)  23959  30.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
54128
69.3%
132.0056798132321
 
< 0.1%
145.9521771461861
 
< 0.1%
43.96867897459661
 
< 0.1%
145.5571916106171
 
< 0.1%
145.5819612510611
 
< 0.1%
11.73191338115411
 
< 0.1%
145.699983417291
 
< 0.1%
145.6161292973741
 
< 0.1%
142.1874725418111
 
< 0.1%
Other values (23959)23959
30.7%

Most occurring characters

ValueCountFrequency (%)
?54128
12.4%
142798
9.8%
436490
8.4%
336297
8.3%
935403
8.1%
235353
8.1%
535136
8.1%
834691
8.0%
734523
7.9%
633717
7.7%
Other values (3)56553
13.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number356966
82.0%
Other Punctuation78095
 
17.9%
Dash Punctuation28
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
142798
12.0%
436490
10.2%
336297
10.2%
935403
9.9%
235353
9.9%
535136
9.8%
834691
9.7%
734523
9.7%
633717
9.4%
032558
9.1%
Other Punctuation
ValueCountFrequency (%)
?54128
69.3%
.23967
30.7%
Dash Punctuation
ValueCountFrequency (%)
-28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common435089
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?54128
12.4%
142798
9.8%
436490
8.4%
336297
8.3%
935403
8.1%
235353
8.1%
535136
8.1%
834691
8.0%
734523
7.9%
633717
7.7%
Other values (3)56553
13.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII435089
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?54128
12.4%
142798
9.8%
436490
8.4%
336297
8.3%
935403
8.1%
235353
8.1%
535136
8.1%
834691
8.0%
734523
7.9%
633717
7.7%
Other values (3)56553
13.0%

Z9
Categorical

HIGH CARDINALITY

Distinct23969
Distinct (%)30.7%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
54128 
-8.56945378061921
 
1
15.4318064856192
 
1
-58.7974819650175
 
1
15.5413043164263
 
1
Other values (23964)
23964 

Length

Max length20
Median length1
Mean length5.799067814
Min length1

Characters and Unicode

Total characters452884
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23968 ?
Unique (%)30.7%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     54128  69.3%
-8.56945378061921     1      < 0.1%
15.4318064856192      1      < 0.1%
-58.7974819650175     1      < 0.1%
15.5413043164263      1      < 0.1%
15.496054562967       1      < 0.1%
-53.6554807279968     1      < 0.1%
15.1494727625968      1      < 0.1%
15.3004855298731      1      < 0.1%
24.5118463586034      1      < 0.1%
Other values (23959)  23959  30.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
54128
69.3%
8.569453780619211
 
< 0.1%
15.43180648561921
 
< 0.1%
58.79748196501751
 
< 0.1%
15.54130431642631
 
< 0.1%
15.4960545629671
 
< 0.1%
53.65548072799681
 
< 0.1%
15.14947276259681
 
< 0.1%
15.30048552987311
 
< 0.1%
24.51184635860341
 
< 0.1%
Other values (23959)23959
30.7%

Most occurring characters

ValueCountFrequency (%)
?54128
12.0%
138269
8.5%
237978
8.4%
436618
8.1%
536363
8.0%
336128
8.0%
735909
7.9%
635592
7.9%
835219
7.8%
933753
7.5%
Other values (3)72927
16.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number357345
78.9%
Other Punctuation78095
 
17.2%
Dash Punctuation17444
 
3.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
138269
10.7%
237978
10.6%
436618
10.2%
536363
10.2%
336128
10.1%
735909
10.0%
635592
10.0%
835219
9.9%
933753
9.4%
031516
8.8%
Other Punctuation
ValueCountFrequency (%)
?54128
69.3%
.23967
30.7%
Dash Punctuation
ValueCountFrequency (%)
-17444
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common452884
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?54128
12.0%
138269
8.5%
237978
8.4%
436618
8.1%
536363
8.0%
336128
8.0%
735909
7.9%
635592
7.9%
835219
7.8%
933753
7.5%
Other values (3)72927
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII452884
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?54128
12.0%
138269
8.5%
237978
8.4%
436618
8.1%
536363
8.0%
336128
8.0%
735909
7.9%
635592
7.9%
835219
7.8%
933753
7.5%
Other values (3)72927
16.1%

X10
Categorical

HIGH CARDINALITY

Distinct14754
Distinct (%)18.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
63343 
68.0217323550982
 
1
-7.78173570091604
 
1
-7.51672271571658
 
1
-6.66363274225354
 
1
Other values (14749)
14749 

Length

Max length20
Median length1
Mean length3.839249129
Min length1

Characters and Unicode

Total characters299830
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14753 ?
Unique (%)18.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     63343  81.1%
68.0217323550982      1      < 0.1%
-7.78173570091604     1      < 0.1%
-7.51672271571658     1      < 0.1%
-6.66363274225354     1      < 0.1%
-6.5844946365838      1      < 0.1%
71.6044391616308      1      < 0.1%
70.7045983957034      1      < 0.1%
21.012843107676       1      < 0.1%
21.1623861106235      1      < 0.1%
Other values (14744)  14744  18.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
63343
81.1%
68.02173235509821
 
< 0.1%
7.781735700916041
 
< 0.1%
7.516722715716581
 
< 0.1%
6.663632742253541
 
< 0.1%
6.58449463658381
 
< 0.1%
71.60443916163081
 
< 0.1%
70.70459839570341
 
< 0.1%
21.0128431076761
 
< 0.1%
21.16238611062351
 
< 0.1%
Other values (14744)14744
 
18.9%

Most occurring characters

ValueCountFrequency (%)
?63343
21.1%
124406
 
8.1%
623346
 
7.8%
822168
 
7.4%
322053
 
7.4%
521787
 
7.3%
721676
 
7.2%
421438
 
7.2%
221366
 
7.1%
921038
 
7.0%
Other values (3)37209
12.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number219816
73.3%
Other Punctuation78095
 
26.0%
Dash Punctuation1919
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
124406
11.1%
623346
10.6%
822168
10.1%
322053
10.0%
521787
9.9%
721676
9.9%
421438
9.8%
221366
9.7%
921038
9.6%
020538
9.3%
Other Punctuation
ValueCountFrequency (%)
?63343
81.1%
.14752
 
18.9%
Dash Punctuation
ValueCountFrequency (%)
-1919
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common299830
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?63343
21.1%
124406
 
8.1%
623346
 
7.8%
822168
 
7.4%
322053
 
7.4%
521787
 
7.3%
721676
 
7.2%
421438
 
7.2%
221366
 
7.1%
921038
 
7.0%
Other values (3)37209
12.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII299830
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?63343
21.1%
124406
 
8.1%
623346
 
7.8%
822168
 
7.4%
322053
 
7.4%
521787
 
7.3%
721676
 
7.2%
421438
 
7.2%
221366
 
7.1%
921038
 
7.0%
Other values (3)37209
12.4%

Y10
Categorical

HIGH CARDINALITY

Distinct14754
Distinct (%)18.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
63343 
96.0203917458069
 
1
92.8959821759129
 
1
92.874392350055
 
1
92.8305816076823
 
1
Other values (14749)
14749 

Length

Max length18
Median length1
Mean length3.812320733
Min length1

Characters and Unicode

Total characters297727
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14753 ?
Unique (%)18.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     63343  81.1%
96.0203917458069      1      < 0.1%
92.8959821759129      1      < 0.1%
92.874392350055       1      < 0.1%
92.8305816076823      1      < 0.1%
92.8345820595973      1      < 0.1%
10.9018717365863      1      < 0.1%
10.9533610923699      1      < 0.1%
97.7863070944363      1      < 0.1%
97.7776418342309      1      < 0.1%
Other values (14744)  14744  18.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
63343
81.1%
96.02039174580691
 
< 0.1%
92.89598217591291
 
< 0.1%
92.8743923500551
 
< 0.1%
92.83058160768231
 
< 0.1%
92.83458205959731
 
< 0.1%
10.90187173658631
 
< 0.1%
10.95336109236991
 
< 0.1%
97.78630709443631
 
< 0.1%
97.77764183423091
 
< 0.1%
Other values (14744)14744
 
18.9%

Most occurring characters

ValueCountFrequency (%)
?63343
21.3%
125875
8.7%
322678
 
7.6%
822052
 
7.4%
921920
 
7.4%
421808
 
7.3%
721806
 
7.3%
521792
 
7.3%
221527
 
7.2%
620530
 
6.9%
Other values (3)34396
11.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number219613
73.8%
Other Punctuation78095
 
26.2%
Dash Punctuation19
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
125875
11.8%
322678
10.3%
822052
10.0%
921920
10.0%
421808
9.9%
721806
9.9%
521792
9.9%
221527
9.8%
620530
9.3%
019625
8.9%
Other Punctuation
ValueCountFrequency (%)
?63343
81.1%
.14752
 
18.9%
Dash Punctuation
ValueCountFrequency (%)
-19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common297727
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?63343
21.3%
125875
8.7%
322678
 
7.6%
822052
 
7.4%
921920
 
7.4%
421808
 
7.3%
721806
 
7.3%
521792
 
7.3%
221527
 
7.2%
620530
 
6.9%
Other values (3)34396
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII297727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?63343
21.3%
125875
8.7%
322678
 
7.6%
822052
 
7.4%
921920
 
7.4%
421808
 
7.3%
721806
 
7.3%
521792
 
7.3%
221527
 
7.2%
620530
 
6.9%
Other values (3)34396
11.6%

Z10
Categorical

HIGH CARDINALITY

Distinct14754
Distinct (%)18.9%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
63343 
-11.669691780653
 
1
-0.649320721902374
 
1
-0.433422809872846
 
1
-0.049743270911728
 
1
Other values (14749)
14749 

Length

Max length20
Median length1
Mean length3.948819402
Min length1

Characters and Unicode

Total characters308387
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14753 ?
Unique (%)18.9%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     63343  81.1%
-11.669691780653      1      < 0.1%
-0.649320721902374    1      < 0.1%
-0.433422809872846    1      < 0.1%
-0.049743270911728    1      < 0.1%
-0.125721209965477    1      < 0.1%
-49.0984253442836     1      < 0.1%
-49.4977163886726     1      < 0.1%
-0.660040129863011    1      < 0.1%
-0.806760402981329    1      < 0.1%
Other values (14744)  14744  18.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
63343
81.1%
11.6696917806531
 
< 0.1%
0.6493207219023741
 
< 0.1%
0.4334228098728461
 
< 0.1%
0.0497432709117281
 
< 0.1%
0.1257212099654771
 
< 0.1%
49.09842534428361
 
< 0.1%
49.49771638867261
 
< 0.1%
0.6600401298630111
 
< 0.1%
0.8067604029813291
 
< 0.1%
Other values (14744)14744
 
18.9%

Most occurring characters

ValueCountFrequency (%)
?63343
20.5%
123177
 
7.5%
722744
 
7.4%
222729
 
7.4%
322414
 
7.3%
822370
 
7.3%
522196
 
7.2%
422069
 
7.2%
621968
 
7.1%
920755
 
6.7%
Other values (3)44622
14.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number220047
71.4%
Other Punctuation78095
 
25.3%
Dash Punctuation10245
 
3.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
123177
10.5%
722744
10.3%
222729
10.3%
322414
10.2%
822370
10.2%
522196
10.1%
422069
10.0%
621968
10.0%
920755
9.4%
019625
8.9%
Other Punctuation
ValueCountFrequency (%)
?63343
81.1%
.14752
 
18.9%
Dash Punctuation
ValueCountFrequency (%)
-10245
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common308387
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?63343
20.5%
123177
 
7.5%
722744
 
7.4%
222729
 
7.4%
322414
 
7.3%
822370
 
7.3%
522196
 
7.2%
422069
 
7.2%
621968
 
7.1%
920755
 
6.7%
Other values (3)44622
14.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII308387
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?63343
20.5%
123177
 
7.5%
722744
 
7.4%
222729
 
7.4%
322414
 
7.3%
822370
 
7.3%
522196
 
7.2%
422069
 
7.2%
621968
 
7.1%
920755
 
6.7%
Other values (3)44622
14.5%

X11
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct33
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
78064 
-6.9728851033097
 
1
-63.121001157804
 
1
-48.9705031160944
 
1
-47.3056686828518
 
1
Other values (28)
 
28

Length

Max length17
Median length1
Mean length1.006133477
Min length1

Characters and Unicode

Total characters78575
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     78064  > 99.9%
-6.9728851033097      1      < 0.1%
-63.121001157804      1      < 0.1%
-48.9705031160944     1      < 0.1%
-47.3056686828518     1      < 0.1%
-48.454737339696      1      < 0.1%
-47.4989886070089     1      < 0.1%
23.7751773846233      1      < 0.1%
0                     1      < 0.1%
-63.28912036944       1      < 0.1%
Other values (23)     23     < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
78064
> 99.9%
6.97288510330971
 
< 0.1%
63.1210011578041
 
< 0.1%
48.97050311609441
 
< 0.1%
47.30566868285181
 
< 0.1%
48.4547373396961
 
< 0.1%
47.49898860700891
 
< 0.1%
23.77517738462331
 
< 0.1%
01
 
< 0.1%
63.289120369441
 
< 0.1%
Other values (23)23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
?78064
99.3%
458
 
0.1%
655
 
0.1%
849
 
0.1%
948
 
0.1%
347
 
0.1%
747
 
0.1%
142
 
0.1%
541
 
0.1%
237
 
< 0.1%
Other values (3)87
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation78095
99.4%
Decimal Number457
 
0.6%
Dash Punctuation23
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
458
12.7%
655
12.0%
849
10.7%
948
10.5%
347
10.3%
747
10.3%
142
9.2%
541
9.0%
237
8.1%
033
7.2%
Other Punctuation
ValueCountFrequency (%)
?78064
> 99.9%
.31
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common78575
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?78064
99.3%
458
 
0.1%
655
 
0.1%
849
 
0.1%
948
 
0.1%
347
 
0.1%
747
 
0.1%
142
 
0.1%
541
 
0.1%
237
 
< 0.1%
Other values (3)87
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII78575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?78064
99.3%
458
 
0.1%
655
 
0.1%
849
 
0.1%
948
 
0.1%
347
 
0.1%
747
 
0.1%
142
 
0.1%
541
 
0.1%
237
 
< 0.1%
Other values (3)87
 
0.1%

Y11
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct33
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
78064 
-21.8420931891443
 
1
28.9722746932247
 
1
38.9380997909788
 
1
41.0216992158116
 
1
Other values (28)
 
28

Length

Max length17
Median length1
Mean length1.006018234
Min length1

Characters and Unicode

Total characters78566
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     78064  > 99.9%
-21.8420931891443     1      < 0.1%
28.9722746932247      1      < 0.1%
38.9380997909788      1      < 0.1%
41.0216992158116      1      < 0.1%
39.5521983607328      1      < 0.1%
40.5739231599419      1      < 0.1%
-65.4321430939192     1      < 0.1%
0                     1      < 0.1%
28.4748391256357      1      < 0.1%
Other values (23)     23     < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
78064
> 99.9%
21.84209318914431
 
< 0.1%
28.97227469322471
 
< 0.1%
38.93809979097881
 
< 0.1%
41.02169921581161
 
< 0.1%
39.55219836073281
 
< 0.1%
40.57392315994191
 
< 0.1%
65.43214309391921
 
< 0.1%
01
 
< 0.1%
28.47483912563571
 
< 0.1%
Other values (23)23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
?78064
99.4%
255
 
0.1%
154
 
0.1%
953
 
0.1%
751
 
0.1%
349
 
0.1%
846
 
0.1%
445
 
0.1%
542
 
0.1%
638
 
< 0.1%
Other values (3)69
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation78095
99.4%
Decimal Number465
 
0.6%
Dash Punctuation6
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
255
11.8%
154
11.6%
953
11.4%
751
11.0%
349
10.5%
846
9.9%
445
9.7%
542
9.0%
638
8.2%
032
6.9%
Other Punctuation
ValueCountFrequency (%)
?78064
> 99.9%
.31
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common78566
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?78064
99.4%
255
 
0.1%
154
 
0.1%
953
 
0.1%
751
 
0.1%
349
 
0.1%
846
 
0.1%
445
 
0.1%
542
 
0.1%
638
 
< 0.1%
Other values (3)69
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII78566
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?78064
99.4%
255
 
0.1%
154
 
0.1%
953
 
0.1%
751
 
0.1%
349
 
0.1%
846
 
0.1%
445
 
0.1%
542
 
0.1%
638
 
< 0.1%
Other values (3)69
 
0.1%

Z11
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct33
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size610.2 KiB
?
78064 
-6.36681563499585
 
1
15.0070913220952
 
1
13.0668262962117
 
1
10.9171233250392
 
1
Other values (28)
 
28

Length

Max length17
Median length1
Mean length1.006043843
Min length1

Characters and Unicode

Total characters78568
Distinct characters13
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique32 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row?
3rd row?
4th row?
5th row?

Common Values

Value                 Count  Frequency (%)
?                     78064  > 99.9%
-6.36681563499585     1      < 0.1%
15.0070913220952      1      < 0.1%
13.0668262962117      1      < 0.1%
10.9171233250392      1      < 0.1%
12.3100469257058      1      < 0.1%
11.4166971277884      1      < 0.1%
-8.95775247930891     1      < 0.1%
0                     1      < 0.1%
15.2570709661483      1      < 0.1%
Other values (23)     23     < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
78064
> 99.9%
6.366815634995851
 
< 0.1%
15.00709132209521
 
< 0.1%
13.06682629621171
 
< 0.1%
10.91712332503921
 
< 0.1%
12.31004692570581
 
< 0.1%
11.41669712778841
 
< 0.1%
8.957752479308911
 
< 0.1%
01
 
< 0.1%
15.25707096614831
 
< 0.1%
Other values (23)23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
?78064
99.4%
166
 
0.1%
255
 
0.1%
948
 
0.1%
546
 
0.1%
644
 
0.1%
343
 
0.1%
042
 
0.1%
842
 
0.1%
440
 
0.1%
Other values (3)78
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation78095
99.4%
Decimal Number464
 
0.6%
Dash Punctuation9
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
166
14.2%
255
11.9%
948
10.3%
546
9.9%
644
9.5%
343
9.3%
042
9.1%
842
9.1%
440
8.6%
738
8.2%
Other Punctuation
ValueCountFrequency (%)
?78064
> 99.9%
.31
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common78568
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?78064
99.4%
166
 
0.1%
255
 
0.1%
948
 
0.1%
546
 
0.1%
644
 
0.1%
343
 
0.1%
042
 
0.1%
842
 
0.1%
440
 
0.1%
Other values (3)78
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII78568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?78064
99.4%
166
 
0.1%
255
 
0.1%
948
 
0.1%
546
 
0.1%
644
 
0.1%
343
 
0.1%
042
 
0.1%
842
 
0.1%
440
 
0.1%
Other values (3)78
 
0.1%

Interactions

2021-09-24T17:30:43.505623image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:43.617578image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:43.721743image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:43.822848image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:43.925401image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.022957image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.123292image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.222783image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.323458image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.427165image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.527195image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.623691image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.718989image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.817163image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:44.918022image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.013911image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.109093image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.208195image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.302647image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.402209image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.500843image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.599708image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.696677image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.792885image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.895326image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:45.994521image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.092676image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.190494image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.288888image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.384933image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.481842image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.581969image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.679172image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.772724image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.870979image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:46.974710image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.076431image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.178491image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.275880image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.377353image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.477955image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.576685image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.677797image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:47.777202image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.050518image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.160597image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.261025image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.360933image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.459750image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.560278image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.665447image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.761713image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.857654image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:48.955048image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.051598image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.153666image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.246819image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.346801image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.446995image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.542446image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.634785image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.729992image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.823720image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:49.918622image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:50.013444image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-09-24T17:30:50.105602image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
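The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code that produced this report; the function name `pearson_r` is an assumption introduced here, and population (divide-by-n) statistics are used, which gives the same r as the sample versions because the factor cancels.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)
```

Shifting or positively rescaling either input, for example replacing each x by 3x + 5, leaves the result unchanged up to floating-point rounding, which is the invariance mentioned above.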

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
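A minimal sketch of this rank-based calculation follows. The helper names `average_ranks` and `spearman_rho` are assumptions made for illustration, and tied values are given their average rank, which is a common convention but is not stated in the report.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)
```

For a monotonic but nonlinear pair such as x = 1..5 and y = x cubed, the ranks of both variables are identical, so ρ is 1 (up to rounding) even though Pearson's r on the raw values is below 1.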

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
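The pair-counting definition above can be sketched directly. This is the simple form sometimes called tau-a; the function name `kendall_tau_a` is an assumption, and tie-corrected variants (which statistical libraries often use) can give different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs
```

With x = [1, 2, 3, 4] and y = [1, 3, 2, 4] there are six pairs, of which five are concordant and one is discordant, so τ = (5 - 1) / 6. The double loop is quadratic in the number of observations, so for a dataset of this size a library implementation would be the practical choice.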

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
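As a rough illustration of the bias correction, the sketch below follows the commonly cited form of Bergsma's estimator: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1), and the row and column counts r and k are shrunk in the same spirit. The function name `cramers_v_corrected` is an assumption, and this is not necessarily the exact code used to produce this report.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_total in row_totals.items():
        for col_label, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return float("nan")  # undefined when a variable has a single category
    return math.sqrt(phi2_corr / denominator)
```

Two perfectly aligned binary label sequences give a value of 1, and a table whose cells all match their expected counts gives 0, matching the range described above.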

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
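The per-column count behind a nullity chart can be sketched as follows. The function name `nullity_by_column` and the example values are made up for illustration, and treating the string "?" as a missing-value marker is an assumption: placeholder strings like this are ordinary values unless they are converted to real nulls when the data is loaded.

```python
def nullity_by_column(rows, missing_markers=(None, "?")):
    """Count, per column, how many cells hold one of the missing markers."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value in missing_markers:
                counts[column] += 1
    return counts


# Hypothetical rows, not taken from the dataset.
example_rows = [
    {"A": "1.5", "B": "?"},
    {"A": "2.5", "B": "?"},
    {"A": "?", "B": "3.5"},
]
example_counts = nullity_by_column(example_rows)
```

Here `example_counts` maps "A" to 1 and "B" to 2. Floating-point NaN values are not detected by this sketch, since NaN does not compare equal to anything.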

Sample

First rows

Class User X0 Y0 Z0 X1 Y1 Z1 X2 Y2 Z2 X3 Y3 Z3 X4 Y4 Z4 X5 Y5 Z5 X6 Y6 Z6 X7 Y7 Z7 X8 Y8 Z8 X9 Y9 Z9 X10 Y10 Z10 X11 Y11 Z11
0000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000000000000000000000000000000
11054.26388071.466776-64.80770976.89563542.462500-72.78054536.62122981.680557-52.91927285.232263885291767.7492195028673-73.68413004183359.188575702788710.6789364098231-71.2977813147725?????????????????????
21056.52755872.266609-61.93525239.13597882.538530-49.59650979.22374343.254091-69.98248987.450872946962568.4008083028339-70.70399092595961.587451553275311.7799190329758-68.827417756239?????????????????????
31055.84992872.469064-62.56278837.98880482.631347-50.60625978.45152643.567403-70.65848986.835387568076268.9079249764243-71.138344136573961.686427191057611.7934398850428-68.88931646056?????????????????????
41055.32964771.707275-63.68895636.56186381.868749-52.75278486.32063068.214645-72.22846161.596157128897811.2506481750465-68.956425230743177.387225412391242.7178334810919-72.0151462991019?????????????????????
51055.14240171.435607-64.17730336.17581881.556874-53.47574776.98614342.426849-72.57474386.368748060576567.9012603746826-72.444649964816961.275402195971410.8411094568665-69.2799064015993?????????????????????
61055.58118471.641201-63.70313734.85056581.352041-54.74744377.07851242.548245-72.48548986.851331683302968.0118361327612-71.909937730697561.856846964914410.851973378857-68.8537517321284?????????????????????
71034.52282481.457317-54.90099555.82768771.878788-63.19436886.90265368.312680-71.64207461.82952651594411.0149780917446-68.958796290705176.954507416039842.7346392143395-72.500618925002?????????????????????
81061.62155010.968187-69.13403732.67817381.172874-56.99436286.73236868.308089-71.83400376.829192174439342.7382282883737-72.6301463414381????????????????????????
91061.40135611.014961-69.37941832.52764381.127660-57.09247386.42106668.405649-72.12216177.146546060716842.8279620708044-72.261478907860655.609708200423372.0741963345598-63.1884658182293?????????????????????

Last rows

Class User X0 Y0 Z0 X1 Y1 Z1 X2 Y2 Z2 X3 Y3 Z3 X4 Y4 Z4 X5 Y5 Z5 X6 Y6 Z6 X7 Y7 Z7 X8 Y8 Z8 X9 Y9 Z9 X10 Y10 Z10 X11 Y11 Z11
7808651454.908781129.303806-42.93431278.31805233.683337-43.840576-1.94085297.8176664.01510727.6853470233534107.57301484038911.154426695944226.3506009552266136.579983974067-31.6351880294445-30.165900132398277.4294901816379-17.675794899992987.938726278463462.8677169867047-58.548465757452661.9541345674737100.610016551058-7.06207948015619-2.15405383559464124.472009502534-42.5991290237987-21.5198133745536103.878992733138-55.2438149453613??????
7808751454.551317129.518766-42.78810378.16435333.809802-44.171496-1.87231697.6224653.82548827.6592436354223107.49368076120911.124145356441887.7570141827762.9020519815344-58.7952094666383-30.389097851845177.6075268475593-17.066599585361626.315942298426136.460496233348-31.87275886218361.6629807631553100.864118464229-6.70149250888009-2.04360732206045124.11591040336-43.3134903419642-22.127474597714104.456581296732-53.799793656214??????
7808851454.445660129.540360-42.997068-1.72931097.5088063.87457487.41016663.000213-59.24552678.02552776066833.8348766953659-44.564149074774926.0299827080131136.857504352242-31.205443372632627.7183895417336107.49633816905811.317220920714661.8304214663095100.774006600052-6.88473300930496-30.071973347486377.4028439439658-17.2128783407945-1.26918178096319123.323333356399-45.1570302775825-22.5278299293037104.843374144626-52.9224461703548??????
7808951454.169153129.613174-43.02982577.62232634.047057-44.09606027.649768107.62132611.517868-1.9347240284607297.81406142345144.3670507254873125.929106013029136.743776294852-31.6956099109053-30.268124782384677.6194788360856-16.858413099941387.725639174250361.8523978389894-61.2264753501689-1.34581059889933123.180693777383-45.523667020953361.43480398692101.174811346254-6.02077840676016-22.7966202281261104.919796414431-52.642591884814??????
7809051454.082504129.618681-43.27730977.32221234.205727-43.83590927.685119107.73376611.462023-1.38676063977763123.071580851701-45.8603868136038-1.9440556934527798.00839893534554.3492813964536425.883687004504136.662550671203-32.1218647417411-30.276513977177477.6137632686951-16.968827792722861.3400914772629101.369663763961-5.9448483749332387.503984505236661.7788546210994-61.3485641400293-22.9731363475939104.912996475043-52.5901999876606??????
7809151454.251127129.177414-44.25251127.720784107.81066111.099282-1.270139122.758679-46.460186-1.901939773590998.05688111920964.0750182554034326.0308784490985136.368236497962-32.792392596589577.293709806663134.0708569530435-43.2637900001125-30.282201332142677.6223061447627-17.15029172919-22.8338299139098104.593912875762-53.126135522878761.4949522101345101.205748000188-6.5431145335446487.733364309622161.2163626180662-62.1262783028233??????
7809251454.334883129.253842-44.01632027.767911107.91480811.069842-30.33405477.858214-17.002723-22.7439236254982104.726271994453-53.007243704520726.4469428817139135.82349439399-33.8514248055691-1.09067420384509122.724914373665-46.5583729086904-1.9585022044205998.28893115550664.2483336979389661.3900145605159101.468284194012-5.9853708064933488.177233156254860.7806417592771-62.130562571214578.229847391913233.0784071753174-45.2495433686171??????
7809351454.151540129.269502-44.17327327.725978108.03400611.020347-22.574718104.222208-53.939140-30.225674904035377.6889105615702-17.354843688573626.5066400161928135.605902859879-34.441462875567161.4753314934401101.321181459117-6.61698844268879-1.12573832398984122.6140268058-46.796054150944388.291028702848260.3627679995225-62.7187354592767-1.7794949037679198.08945926789553.8359037058878278.59155401054232.32773753847-46.1665560916632??????
7809451427.915311108.00739010.814957-0.910435122.464093-47.271248-30.08458877.705861-17.46085326.7159145624784135.523929259824-34.6757739677746-22.3294952272238104.026431451371-54.334622687210761.6393864623392101.224661857702-6.840941810990554.0099616832869129.477879104126-43.9678201146984-1.5814149294360298.01741116203263.3942913131192488.465709372069160.0507908377745-63.2216259324485?????????
7809551427.898705108.09287711.107857-30.03140277.740235-17.453099-1.091566122.827638-46.76098553.9653329040079129.602373686175-43.6796644772844-1.5588457872719398.01720072504343.4218137696077226.5869101964665135.762921713685-34.4459435415338-22.3160291864615104.124677433799-54.344078680543261.4542004587843101.466350577221-6.3423021574978388.347447419981960.2081094108734-63.026574415522278.879652566601431.6030997578398-47.0137357205196??????